1 Introduction

Lattice QCD with Wilson fermions is a long-established regularisation. The fermionic action satisfies most desirable properties, namely strict locality, lack of fermion doublers, and preservation of flavour symmetry in a straightforward way. Well-known shortcomings are the presence of discretisation effects linear in the lattice spacing and, most importantly, the loss of chiral symmetry. The first problem is solved by applying the Symanzik-improvement programme (see for instance Ref. [1] for a review and Ref. [2] for more details). Chiral symmetry is recovered in the continuum, at the cost of having to deal with complicated renormalisation properties for most quantities of interest (cf. Ref. [3] and references therein; for a review see also Ref. [4]). A frequently cited example of these complications is the power divergence \(m_{\mathrm{crit}} \sim 1/a\), which must be subtracted from bare quark masses before they are renormalised multiplicatively. Other examples are the normalisation parameter \(Z_{\mathrm {A}}\) of the axial current and the ratio \(Z_{\mathrm {S}}/Z_{\mathrm {P}}\) of the non-singlet scalar and pseudoscalar density renormalisation parameters. In a regularisation scheme which respects chiral symmetry, these quantities are strictly equal to unity at finite values of the UV cutoff. With Wilson fermions these quantities are scale-independent finite functions of the gauge coupling, which tend to unity as we approach the continuum limit. In principle they are determined by requiring that chiral Ward identities at non-vanishing lattice spacing tend to their formal counter-parts in the continuum limit. The scope of this paper is to provide a method for the determination of \(Z_{\mathrm {S}}/Z_{\mathrm {P}}\) based on Ward identities on physically small lattices with Schrödinger functional boundary conditions and realising a line of constant physics (LCP) in parameter space. Results are obtained for \(N_{\mathrm{f}}=3\) dynamical quarks.

The general idea behind using chiral Ward identities in order to evaluate \(Z_{\mathrm {S}}/Z_{\mathrm {P}}\) for Wilson fermions appeared in Ref. [3].Footnote 1 It has been put to practice with quenched, unimproved Wilson fermions in Ref. [5] and subsequently with tree-level Symanzik-improved ones in Ref. [6]. The chiral Ward identities in question were obtained for large-volume lattices with periodic boundary conditions and non-chiral quark masses. Ratios of \(Z_{\mathrm {S}}/Z_{\mathrm {P}}\) were calculated at fixed gauge coupling for several quark masses and extrapolated to the chiral limit. A second-generation of calculations was not based on Ward identities but obtained by computing \(Z_{\mathrm {S}}\) and \(Z_{\mathrm {P}}\) in the RI/MOM scheme [7]. Again these calculations are performed at finite quark masses, followed by chiral extrapolations. A well known problem in this approach is that the \(Z_{\mathrm {S}}/Z_{\mathrm {P}}\) ratio thus obtained differs from the Ward identity one by “Goldstone pole contaminations” at the IR end of a renormalisation window. This problem was first identified in Ref. [7], and subsequently discussed in Refs. [8,9,10,11] (and reviewed in Ref. [4]), while the discussion specific to the difference between Ward identity and RI/MOM determinations of the ratio \(Z_{\mathrm {S}}/Z_{\mathrm {P}}\) is found in Ref. [10]. Although the problem is greatly attenuated by the RI/SMOM variant of this method [12], the requirement of a reliable renormalisation window is inherent in these approaches.

In the present work we revisit the Ward identity method, with an important novelty: lattices with small physical volumes and Schrödinger functional boundary conditions are used, with quark flavours degenerate in mass and (almost) at the chiral limit. In doing so, we follow closely the method introduced in Ref. [13] (and originally applied in the quenched approximation in that work) for the non-perturbative determination of the scale independent normalisation parameter \(Z_{\mathrm {A}}\) of the axial vector current. Updates and optimisations of these computations can be found in refs. [14, 15] for two- and three-flavour QCD, respectively. Ward identities are imposed at constant physics to ensure a removal of \({\mathrm {O}}(a)\) effects in on-shell quantities and, at the same time, smoothly vanishing \({\mathrm {O}}(a^2)\) effects as the bare coupling is varied. It must be stressed that the chiral Ward identities adopted in these works to determine \(Z_{\mathrm {A}}\) are valid for \(N_{\mathrm{f}}\ge 2\) quark flavours, while the ones we introduce in the present work for the determination of \(Z_{\mathrm {S}}/(Z_{\mathrm {P}} Z_{\mathrm {A}})\) are valid for \(N_{\mathrm{f}}\ge 3\).

We note in passing that, based on the chirally rotated Schrödinger functional construction of Ref. [16], a more recent method for the non-perturbative computation of \(Z_{\mathrm {S}}/Z_{\mathrm {P}}\) has been mentioned in Ref. [17].

This paper is organised as follows: in Sect. 2 (Sect. 2.1) we formally derive chiral Ward identities for continuum QCD, which relate correlation functions of non-singlet pseudoscalar and scalar composite operators (densities). The former are correlation functions with two operator insertions at two distinct space-time points (an axial current and a pseudoscalar density) in the presence of a generic external source operator. The latter involve a single insertion of the scalar operator. Subsequently (Sect. 2.2), we rewrite the same Ward identities in the lattice-regularised QCD with Wilson fermions. The external source consists of two standard Schrödinger functional boundary sources, each placed at a temporal boundary. The loss of chiral symmetry by Wilson fermions is taken into account by the renormalisation constants \(Z_{\mathrm {P}}\) and \(Z_{\mathrm {S}}\) of the pseudoscalar and scalar densities and the normalisation of the axial current, \(Z_{\mathrm {A}}\). In the chiral limit, these Ward identities hold up to \(\mathrm{O}(a^2)\) discretisation effects. We also discuss the corrections arising in practical simulations, which slightly deviate from the chiral limit; these are \(\mathrm{O}(am,a^2)\). Finally, in Sect. 2.3 we re-express these Ward identities in terms of traces of valence quark propagators, which multiply factored-out traces of generators of the \(SU(N_{\mathrm{f}})\) flavour group.

Section 3 takes an even closer look at these Ward identities. We distinguish several equivalence classes, each consisting of identities with different flavour structure, which reduce to the same relations between correlation functions, giving the same \(Z_{\mathrm {S}}/(Z_{\mathrm {P}} Z_{\mathrm {A}})\) result. Ward identities belonging to different equivalence classes provide \(Z_{\mathrm {S}}/(Z_{\mathrm {P}} Z_{\mathrm {A}})\) estimates which differ by \(\mathrm{O}(am,a^2)\) effects. If we neglect these effects, we can combine identities from different equivalence classes, ending up with new relations between correlation functions (true up to \(\mathrm{O}(am,a^2)\) errors). Thus we can explore to what extent different equivalence classes provide independent estimates of \(Z_{\mathrm {S}}/(Z_{\mathrm {P}} Z_{\mathrm {A}})\). Some of these estimates are expected to be noisier than others, as they are obtained using both quark-connected and quark-disconnected correlation functions.

In Sect. 4 we present our results for QCD with \(N_{\mathrm{f}}= 3\) dynamical flavours, where the lattice gauge action is tree-level Symanzik-improved and the fermion action is non-perturbatively Wilson–Clover improved. Our simulations are performed with degenerate mass flavours lying close to the chiral limit. The non-perturbative determination of the ratio \(Z_{\mathrm {S}}/Z_{\mathrm {P}}\) is carried out along a line of constant physics in parameter space. In practice, this requirement is met by ensuring a volume of almost constant spatial extent \(L\sim 1.2\,\)fm in physical units, with Schrödinger functional boundary conditions. The ratio between temporal and spatial extent T/L is also kept fixed. This implies that any remaining intrinsic ambiguities in \(Z_{\mathrm {S}}/Z_{\mathrm {P}}\) of \(\mathrm{O}(a^2)\) or higher (in the \(\mathrm{O}(a)\) improved setup adopted here) disappear smoothly towards the continuum limit. The gauge couplings of our simulations span a range typical for the computations performed by the CLS (Coordinated Lattice Simulations) effort in QCD with \(N_{\mathrm{f}}=2+1\) flavours of non-perturbatively improved Wilson fermions [18,19,20,21]. Our \(Z_{\mathrm {S}}/(Z_{\mathrm {P}} Z_{\mathrm {A}})\) results are divided out by \(Z_{\mathrm {A}}\), estimated in Ref. [22]. Our \(Z_{\mathrm {S}}/Z_{\mathrm {P}}\) estimates are subsequently extrapolated to the chiral limit at fixed \(g_0^2\). Results are obtained from several Ward identities; they differ by discretisation effects. Thus it is possible to create ratios of the different \(Z_{\mathrm {S}}/Z_{\mathrm {P}}\) determinations, and plot them against (powers of) the lattice spacing, confirming the expected scaling behaviour. The statistically and systematically most precise \(Z_{\mathrm {S}}/Z_{\mathrm {P}}\) determination is parameterised as a continuous function of \(g_0^2\), which is our final answer. This is compared to two other determinations: one is based on ratios of PCAC quark masses with different flavours, employing essentially the same small-volume Schrödinger functional setup [23]; the other is based on the relation between bare current quark masses and bare subtracted quark masses, computed on large volumes with open boundary conditions [20].

Finally, in Sect. 5 we discuss how \(Z_{\mathrm {S}}/Z_{\mathrm {P}}\) can be used in quark mass determinations along the lines proposed in Ref. [24], but performing the mass renormalisation in the Schrödinger functional scheme and the renormalisation group running non-perturbatively, between renormalisation scales \(\mu _{{\mathrm {had}}} \sim \Lambda _{{\mathrm {QCD}}}\) and \(\mu _{\mathrm {PT}} \sim M_{{\mathrm {W}}}\). Such a calculation is subjected to different systematics than the standard ALPHA-CLS method, recently applied in Ref. [25].

Work in progress culminating to this paper had been reported in Refs. [26, 27].

2 Chiral Ward identities for \(Z_{\mathrm {S}}/Z_{\mathrm {P}}\)

In this Section we will derive chiral Ward identities which relate correlation functions of non-singlet scalar and pseudoscalar composite operators (densities). These enable us to compute non-perturbatively the ratio \(Z_{\mathrm {S}}/Z_{\mathrm {P}}\), which determines the relative normalisation of these scalar and pseudoscalar densities when the regularisation (Wilson fermion action) breaks chiral symmetry. First we will derive the pertinent chiral Ward identities in the formal continuum theory. Subsequently, we will show their lattice analogues with Schrödinger functional boundary conditions. The resulting Ward identity computation of \(Z_{\mathrm {S}}/Z_{\mathrm {P}}\) follows very closely that of \(Z_{\mathrm {A}}\), described in refs. [13,14,15].

Our notation is pretty standard. Definitions of composite operators of dimension-3, axial transformations and Schrödinger functional (SF) boundary operators are collected in Appendix A. Conventions concerning the \(su(N_{\mathrm{f}})\) flavour algebra are to be found in Appendix B. The lattice spacing is denoted by a, the (squared) gauge coupling by \(g_0^2\), and the inverse lattice coupling by \(\beta \equiv 6/g_0^2\). Bare current (PCAC) and subtracted masses are defined in Appendix C.

2.1 Formal chiral Ward identities in the continuum

Under the small axial variations (A.8) of the fermion fields the formal, continuum QCD action in Euclidean space-time transforms as follows:

$$\begin{aligned} \delta _{{\mathrm {A}}} S&= \int d^4x \Big [ (\partial _\mu \epsilon ^a(x)) A_\mu ^a(x) + {\mathrm {i}} \epsilon ^a(x) \bar{\psi }(x) \{ T^a, M \} \gamma _5 \psi (x) \Big ] \nonumber \\&= \int d^4x \,\, \epsilon ^a(x) \,\, \Big [ - \partial _\mu A_\mu ^a(x) + 2m P^a(x) \Big ] \,. \end{aligned}$$
(1)

The fermion mass matrix is denoted by M. We work in the flavour symmetric (isospin) limit, so all quark masses m are degenerate. In the last expression we have integrated by parts the term with the axial current. Chiral Ward identities are obtained by considering that under the change of field variables defined in Eqs. (A.6), the expectation value of any composite operator \({\mathcal {O}}\) (and products of them) is invariant. In the limit of small axial variations this leads to:

$$\begin{aligned} \delta _{\mathrm {A}} \langle {\mathcal {O}}\rangle&= \dfrac{1}{\mathcal {Z}} \, \delta _{\mathrm {A}} \,\, \langle \,\, \int [{\mathcal {D}} \psi ] [{\mathcal {D}} \bar{\psi }] [{\mathcal {D}} G_\mu ] \,\, {\mathcal {O}} \,\, \exp (-S) \,\,\rangle = 0 \nonumber \\ \Rightarrow \,\, \langle \delta _{\mathrm {A}} {\mathcal {O}} \rangle&= \langle {\mathcal {O}}\,\, \delta _{\mathrm {A}} S \rangle \,. \end{aligned}$$
(2)

We now take the axial variations to be non zero only in a space-time region R with a smooth boundary \(\partial R\) (i.e., for \(x \in R\), \(\epsilon ^a(x) \ne 0\); otherwise \(\epsilon ^a(x) = 0\)). The above expression reduces to

$$\begin{aligned} \begin{aligned} \int _R d^4x \epsilon ^a(x) \Big [ \partial _\mu \langle A_\mu ^a(x) \,\, {\mathcal {O}}\rangle {-} 2m \langle P^a(x) \,\, {\mathcal {O}} \rangle \Big ] \,{=}\, {-} \, \langle \delta _{\mathrm {A}} {\mathcal {O}} \rangle \,. \end{aligned} \end{aligned}$$
(3)

We consider a product of composite operators \({\mathcal {O}}{=} P^b(y) {\mathcal {O}}_{\mathrm{ext}}\), where \(y \in R\) and \( {\mathcal {O}}_{\mathrm{ext}}\) is defined outside the region R. This implies that \(\delta _{\mathrm {A}} {\mathcal {O}}= [\delta _{\mathrm {A}} P^b(y)] {\mathcal {O}}_{\mathrm{ext}}\). The pseudoscalar density \(P^b(x)\) transforms as follows:

$$\begin{aligned} \delta _{\mathrm {A}} P^b(x) = \epsilon ^c(x) d^{cbe} S^e(x) + \epsilon ^c(x) \dfrac{\delta ^{cb}}{N_{\mathrm{f}}} \bar{\psi }(x) \psi (x) \,. \end{aligned}$$
(4)

At this stage we impose that \(\epsilon ^c(x) = \epsilon \delta ^{ac}\); i.e., it is a constant phase \(\epsilon \) in a fixed direction a in flavour space, so that Ward identities become expressions reflecting global chiral symmetry. Moreover, in order to sidestep a number of complications,Footnote 2 we chose \(a \ne b\), so that the last term on the r.h.s. of Eq. (4) drops out.Footnote 3 Putting everything together, we obtain

$$\begin{aligned}&\int _R d^4x \, \Big [ \partial _\mu \, \langle A_\mu ^a(x) P^b(y) {\mathcal {O}}_{\mathrm{ext}} \rangle - 2m \, \langle P^a(x) P^b(y) {\mathcal {O}}_{\mathrm{ext}} \rangle \Big ] \nonumber \\&\quad = - d^{abe} \langle S^e(y) {\mathcal {O}}_{\mathrm{ext}} \rangle \,. \end{aligned}$$
(5)

We note in passing that the first term is a surface term:

$$\begin{aligned} \begin{aligned}&\int _R d^4x \, \partial _\mu \, \langle A_\mu ^a(x) P^b(y) {\mathcal {O}}_{\mathrm{ext}} \rangle \\ {}&\quad =\int _{\partial R} d\sigma _\mu (x) \langle A_\mu ^a(x) P^b(y) {\mathcal {O}}_{\mathrm{ext}} \rangle \,. \end{aligned} \end{aligned}$$
(6)

As done in Ref. [13] for \(Z_{\mathrm {A}}\), we chose the region R to be the space-time volume between the hyper-planes at \(y_0-t\) and \(y_0+t\).Footnote 4 Boundary conditions in space are periodic, implying \(\int _R dx_0 d^3 x \partial _k \langle A_k \cdots \rangle = 0\). The Ward identity becomes

$$\begin{aligned} \begin{aligned}&\int d^3\mathbf{x} \Big \langle \Big [A_0^a(y_0+t;\mathbf{x}) \, -\, A_0^a(y_0-t;\mathbf{x}) \Big ] \, P^b(y_0;\mathbf{y}) \,\, {\mathcal {O}}_{\mathrm{ext}} \Big \rangle \\&\quad \quad - 2m \, \int d^3\mathbf{x} \int _{y_0-t}^{y_0+t} dx_0 \,\, \langle P^a(x_0;\mathbf{x}) P^b(y_0;\mathbf{y}) {\mathcal {O}}_{\mathrm{ext}} \rangle \\&\quad = - d^{abe} \langle S^e(y) {\mathcal {O}}_{\mathrm{ext}} \rangle \,. \end{aligned} \end{aligned}$$
(7)

It is convenient to introduce a spatial integration over \(\mathbf{y}\):

$$\begin{aligned}&\int d^3\mathbf{y} \int d^3\mathbf{x} \Big \langle \Big [A_0^a(y_0+t;\mathbf{x}) - A_0^a(y_0-t;\mathbf{x}) \Big ] P^b(y_0;\mathbf{y}) \, {\mathcal {O}}_{\mathrm{ext}} \Big \rangle \nonumber \\&\qquad -2m \, \int d^3\mathbf{y} \int d^3\mathbf{x} \int _{y_0-t}^{y_0+t} dx_0 \,\, \langle P^a(x_0;\mathbf{x}) P^b(y_0;\mathbf{y}) {\mathcal {O}}_{\mathrm{ext}} \rangle \nonumber \\&\quad = - d^{abe} \int d^3\mathbf{y} \,\, \langle S^e(y) {\mathcal {O}}_{\mathrm{ext}} \rangle \, . \end{aligned}$$
(8)

The second line of the l.h.s. contains a contact term, arising when \(r \equiv |x-y| \rightarrow 0\). The operator product is expressed in terms of an OPE (recall that \(a \ne b\))

$$\begin{aligned} \begin{aligned}&P^a(x) \,\, P^b(y) \sim d^{abe} \sum _{k=1}^\infty C_k Q^e_{k \, [D]} \, r^{D-6} \\&\quad = d^{abe} C_1 S^e(x) r^{-3} + \cdots \,, \end{aligned} \end{aligned}$$
(9)

where [D] is the operator dimension and the Wilson coefficients \(C_k\) contain logarithms. The most divergent term in the OPE, taking into account the various symmetry properties of the operator product, is proportional to \(S^e(x)\). The contribution to the space-time volume integral \(2m \int _R \cdots \) of a small four-sphere of centre x and radius a (or a small four-cube of size a) is then \(\sim ~m~\int _0^a dr~r^3~r^{D-6} \langle \cdots \rangle \sim m~a^{D-2}~ \langle \cdots \rangle \) and thus the leading term in the OPE contributes \(\mathrm{O}(am)\). In the lattice regularisation this implies that the contact term contributes an \(\mathrm{O}(am)\) discretisation effect to the Ward identity, even in a Symanzik-improved setup.

2.2 Lattice Ward identities with Schrödinger functional boundary conditions

We now adapt the previous formal manipulations to the lattice regularisation with Schrödinger functional boundary conditions. The external source for the Ward identity correlation functions is chosen to be a tensor in flavour space \({\mathcal {O}}^{ad}_{\mathrm{ext}}\):

$$\begin{aligned} {\mathcal {O}}^{ad}_{\mathrm{ext}} = \dfrac{1}{2 L^6} {\mathcal {O}}^{\prime a} {\mathcal {O}}^d \,, \end{aligned}$$
(10)

with \({\mathcal {O}}^{\prime a}\) and \({\mathcal {O}}^d\) defined in Eqs. (A.9). With this source and in lattice notation the Ward identity (8) becomes (with \(b \ne c\)):

$$\begin{aligned}&Z_{\mathrm {A}} Z_{\mathrm {P}} \, a^6 \, \nonumber \\&\qquad \times \Bigg \{ \sum _{\mathbf{x},\mathbf{y}} \, \langle {\mathcal {O}}^{\prime a} \Big [ (A_{\mathrm{I}})^b_0(y_0+t; \mathbf{x}) \nonumber \\&\qquad - (A_{\mathrm{I}})^b_0(y_0-t; \mathbf{x}) \Big ] P^c(y_0;\mathbf{y}) \, {\mathcal {O}}^d \rangle \nonumber \\&\qquad - 2m a \sum _{\mathbf{x},\mathbf{y}} \sum _{x_0 = y_0-t}^{y_0+t} w(x_0) \, \langle {\mathcal {O}}^{\prime a} P^b(x_0;\mathbf{x}) \, P^c(y_0;\mathbf{y}) \, {\mathcal {O}}^d \rangle \Bigg \}\nonumber \\&\quad = - d^{bce} Z_{\mathrm {S}} \,\, a^3 \sum _\mathbf{y} \langle {\mathcal {O}}^{\prime a} \, S^e(y) \, {\mathcal {O}}^d \rangle + \mathrm{O}(am,a^2) \,. \end{aligned}$$
(11)

In this expression, repeated flavour indices e are summed, as usual. The weight factor is \(w(x_0) = 1/2\) for \(x_0 = y_0 \pm t\) and \(w(x_0) = 1\) otherwise. It is introduced in order to implement the trapezoidal rule for discretising integrals. The mass m is the current quark mass defined in Eq. (C.5); recall that we work with degenerate masses.

Assuming that we work in the chiral limit (or with nearly vanishing quark masses, so that \(\mathrm{O}(am)\) effects may be safely neglected), the above Ward identity is valid up to \(\mathrm{O}(a^2)\) dicretisation errors in lattice QCD with Wilson quarks. Chiral symmetry breaking implies the (re)normalisation and improvement properties summarised in Appendix C. The Symanzik b-coefficients appearing in Eqs. (C.2)–(C.4) multiply the subtracted quark mass \(m_{\mathrm {q}}\) or the quark mass matrix \(M_{\mathrm {q}}\). When working in or close to the chiral limit, as is the case in our simulations, we may safely drop these terms. Putting everything together we obtain Ward identity (11). The renormalisation factors of the external sources \({\mathcal {O}}^{\prime a}\) and \({\mathcal {O}}^d\) are not taken into consideration, as they cancel out on both sides of the identity. Note that the term proportional to the current quark mass m may also be dropped in the chiral limit. In practice, since we are always working with masses that are not strictly zero, it turns out that it is advantageous to keep this term; see Ref. [15] and Sect. 4.1.

Equation (11) can be solved for \(Z_{\mathrm {S}}/(Z_{\mathrm {P}} Z_{\mathrm {A}})\). With \(Z_{\mathrm {A}}\) known either from other PCAC Ward identities [13,14,15] or from the chirally rotated Schrödinger functional formalism [22], we can thus obtain \(Z_{\mathrm {S}}/Z_{\mathrm {P}}\).

2.3 Lattice Ward identities, Wick contractions, and flavour factors

Fig. 1
figure 1

The trace diagrams contributing to the expectation values of Table 1. The leftmost (rightmost) wall is time-slice \(x_0 = 0\) (\(x_0 = T\)) with a \(\gamma _5\) Dirac matrix between circles. The hexagons in the bulk represent the insertions of a scalar operator S(y). The open circles correspond to the boundary fields \(\zeta \) (at \(x_0 = 0\)) and \(\zeta ^\prime \) (at \(x_0 = T\)), while the filled circles denote \(\bar{\zeta }\) (at \(x_0 = 0\)) and \(\bar{\zeta }^\prime \) (at \(x_0 = T\)). Quark-connected diagrams \( F_{{\mathrm {S;1}}}\) and \( F_{{\mathrm {S;2}}}\) are single traces, formed by starting from any point and following the lines (quark propagators) around until we close the loop. The quark-disconnected diagram is a product of two traces

Ward identity (11) relates expectation values of four composite operators on the l.h.s. to those of three composite operators on the r.h.s.; with a slight abuse of terminology, we call these four- and three-point correlation functions, respectively. We express these correlation functions, with Schrödinger functional boundary fields, in terms of traces of quark propagators. In standard ALPHA notation [28], \([\psi (y) \, \bar{\psi }(x)]_{\mathrm {F}}\) denotes a quark propagator in a fixed background gauge field configuration, where x and y are space-time points in the bulk of the lattice. Propagators from the \(x_0=0\) boundary to the bulk are \([\zeta (\mathbf{v}) \bar{\psi }(y)]_{{\mathrm {F}}}\) (with \(\mathbf{v}\) a point at the \(x_0=0\) boundary), while those from the \(x_0=T\) boundary to the bulk are \([\zeta ^\prime ({\mathbf{v}^\prime }) \bar{\psi }(y)]_{{\mathrm {F}}}\) (with \(\mathbf{v}^\prime \) a point at \(x_0=T\)). Boundary-to-boundary propagators are \([\zeta ^\prime ({\mathbf{v}^\prime }) \bar{\zeta }(\mathbf{u})]_{{\mathrm {F}}}\). For proper definitions see Ref. [28]. Note that, since we are working in the \(su(N_{\mathrm{f}})\)-symmetric limit, all masses are degenerate and quark propagators of different flavours are indistinguishable.Footnote 5

Performing the Wick contractions, we write the three-point correlation function of Eq. (11) as

$$\begin{aligned} \begin{aligned}&a^3 \sum _{\mathbf{y}}\langle {\mathcal {O}}^{\prime a} \, S^e(y) \,{\mathcal {O}}^d \rangle \\&\quad = - {\mathrm {i}} a^{15} \Big ( T^{dea} F_{{\mathrm {S;1}}}(y_0) +T^{aed} F_{{\mathrm {S;2}}}(y_0) \Big ) \,, \end{aligned} \end{aligned}$$
(12)

where \(T^{aed} \equiv \mathrm{Tr}(T^aT^eT^d)\) are traces of three flavour \(su(N_{\mathrm{f}})\) generators and \(F_{{\mathrm {S;1}}}(y_0),F_{{\mathrm {S;2}}}(y_0)\) are expectation values of traces of quark propagators with a scalar insertion. The exact expressions can be found in Table 1. Note that traces \(\mathrm{Tr}\) act in flavour space, traces \(\,\hbox {tr}\,\) act in spin-colour space, and \(\langle \cdots \rangle \) denote averages over gauge field configurations. In Fig. 1 we show the quark-line diagrams corresponding to the spin-colour traces in the above equation. Any Wick contraction between fermion fields at the same point in the bulk \([\psi (y), \bar{\psi }(y)]_{\mathrm{F}}\), or between boundary fields at the same time-slice (e.g. \([\zeta (\mathbf{v}) \bar{\zeta }(\mathbf{u})]_{\mathrm{F}}\)) gives rise to a quark-disconnected diagram,Footnote 6 multiplied by the trace of an \(su(N_{\mathrm{f}})\) generator. As this trace is zero, such diagrams do not contribute to the three-point correlation function. An example of such a diagram is shown in Fig. 1.

Table 1 Mathematical expressions for the diagrams \( F_{{\mathrm {S;}}k}\) depicted in Fig. 1 and the diagrams \(F_{{\mathrm {AP}};k}\) depicted in Fig. 2

In Appendix D we combine the usual \(\gamma _5\)-Hermiticity property of quark propagators, charge conjugation invariance of the lattice theory, and the trace properties of Eq. (B.4), to cast the r.h.s. of Eq. (12) into a single real term, and obtain for the r.h.s. of the Ward identity (11):

$$\begin{aligned} {\mathrm {WI}}~{\mathrm {r.h.s.}} = - \frac{a^{15}}{2} Z_{{\mathrm {S}}} d^{bce} d^{ade} \mathrm{Re}\,\Big [ F_{{\mathrm {S;1}}}(y_0) \Big ] \,. \end{aligned}$$
(13)

Next we concentrate on the l.h.s. of Eq. (11). For simplicity we drop, for the moment, the term proportional to the quark mass. The l.h.s. consists of boundary-to-boundary correlation functions with two insertions of dimension-3 operators in the bulk, which can be cast in the general form

$$\begin{aligned} \begin{aligned} a^6 \sum _{\mathbf {x}, \mathbf {y}} \langle \mathcal {O}^{\prime a} \, A_0^b(x) \, P^c(y) \,\mathcal {O}^d\rangle =\,a^{18}\sum _{k=1}^{9}T_k^{abcd} F_{{\mathrm {AP}};k}(x_0,y_0) \,. \end{aligned} \end{aligned}$$
(14)

Upon performing the Wick contractions, each correlation function is expressed as the sum of 9 terms. They are products of traces of flavour matrices (denoted as \(T_k^{abcd}\)) and traces of loops of quark propagators averaged over gauge field configurations (denoted as \(F_{{\mathrm {AP}};k}(x_0,y_0)\)). The former traces are defined as:

$$\begin{aligned} T_1^{abcd}&\equiv \mathrm{Tr}(T^aT^bT^cT^d) , \quad T_2^{abcd} \equiv \mathrm{Tr}(T^aT^dT^cT^b) \,, \end{aligned}$$
(15)
$$\begin{aligned} T_3^{abcd}&\equiv \mathrm{Tr}(T^aT^bT^dT^c), \quad T_4^{abcd} \equiv \mathrm{Tr}(T^aT^cT^dT^b) \,, \end{aligned}$$
(16)
$$\begin{aligned} T_5^{abcd}&\equiv \mathrm{Tr}(T^aT^cT^bT^d) , \quad T_6^{abcd} \equiv \mathrm{Tr}(T^aT^dT^bT^c) \,, \end{aligned}$$
(17)
$$\begin{aligned} T_7^{abcd}&\equiv \mathrm{Tr}(T^aT^b)\mathrm{Tr}(T^dT^c) \,, \end{aligned}$$
(18)
$$\begin{aligned} T_8^{abcd}&\equiv \mathrm{Tr}(T^aT^c)\mathrm{Tr}(T^dT^b) \,, \end{aligned}$$
(19)
$$\begin{aligned} T_9^{abcd}&\equiv \mathrm{Tr}(T^aT^d)\mathrm{Tr}(T^cT^b) \,, \end{aligned}$$
(20)

while the latter ones are also given in Table 1.

Fig. 2
figure 2

The trace diagrams contributing to the expectation values of Table 1. Conventions are similar to those of Fig. 1. The diamonds in the bulk represent the insertions of a pseudoscalar operator P(y). The squares in the bulk represent the insertions of an axial current \(A_0(x)\) or a pseudoscalar operator P(x) (giving rise to the Dirac matrices \(\gamma _0 \gamma _5\) or \(\gamma _5\), respectively). Quark-connected diagrams \(F_{{\mathrm {AP}};3}, F_{{\mathrm {AP}};5}, F_{{\mathrm {AP}};1}\) are single traces, formed by starting from any point and following the lines (quark propagators) around until we close the loop. Quark-disconnected diagrams \(F_{{\mathrm {AP}};8}, F_{{\mathrm {AP}};7}, F_{{\mathrm {AP}};9}\) are products of two traces. Diagrams \(F_{{\mathrm {AP}};2}, F_{{\mathrm {AP}};4}, F_{{\mathrm {AP}};6}\) are not shown, as they are related to \(F_{{\mathrm {AP}};1}, F_{{\mathrm {AP}};3}, F_{{\mathrm {AP}};5}\); cf. Eqs. (D.5)

The spin-colour trace diagrams are shown in Fig. 2. We see that there are six quark-connceted diagrams, and three quark-disconnected ones. The condition \(b \ne c\) implies that \(T_9F_{{\mathrm {AP}};9}(x_0,y_0) =0\), due to the vanishing of \({\mathrm {Tr}}(T^cT^b)\). From Eq. (B.2) we see that \(T_k^{abcd}\) for \(k=7,8\) are real.

Once more we combine \(\gamma _5\)-Hermiticity, charge conjugation invariance, and Eq. (B.5), to obtain for the l.h.s. of the Ward identity (11):

$$\begin{aligned}&{\mathrm {WI}}~{\mathrm {l.h.s.}} = Z_{\mathrm {A}} Z_{\mathrm {P}} \,\, a^{18} \nonumber \\&\quad {\times }\left[ \sum _{k{=}1,3,5} 2 \mathrm{Re}\,(T_k^{abcd}) \Big \{ F_{{\mathrm {AP}};k}(y_0{+}t,y_0) {-} F_{{\mathrm {AP}};k}(y_0-t,y_0) \Big \}\right. \nonumber \\&\quad \left. + \sum _{k=7}^8 T_k^{abcd} \Big \{ F_{{\mathrm {AP}};k}(y_0+t,y_0) - F_{{\mathrm {AP}};k}(y_0-t,y_0) \Big \}\right] \,. \end{aligned}$$
(21)

Note that correlation functions \(F_{{\mathrm {AP}};k}\) are real for \(k=1, \ldots , 9\). See Appendix D for more details. We will use a somewhat more compact notation, defining

$$\begin{aligned} \varDelta _k(y_0,t) \equiv F_{{\mathrm {AP}};k}(y_0+t,y_0) - F_{{\mathrm {AP}};k}(y_0-t,y_0) \,. \end{aligned}$$
(22)

Collecting Eqs. (13), (21), and (22), we write the Ward identity (11) in the chiral limit as:

$$\begin{aligned}&a^3 Z_{\mathrm {A}} Z_{\mathrm {P}} \nonumber \\&\qquad \times \Bigg [ \sum _{k=1,3,5} 2 \mathrm{Re}\,(T_k^{abcd}) \varDelta _k(y_0,t) + \sum _{k=7,8} T_k^{abcd} \varDelta _k(y_0,t) \Bigg ]\nonumber \\&\quad =-\dfrac{Z_{{\mathrm {S}}}}{2} d^{bce} d^{ade} \mathrm{Re}\,\Big [ F_{{\mathrm {S;1}}}(y_0) \Big ] + \mathrm{O}(a^2) \,. \end{aligned}$$
(23)

In order to keep the equation simple, we have not shown the mass-dependent terms with two pseudoscalar density insertions, appearing in Eq. (11). These terms are included in the numerical analysis, which is carried out close to, but not strictly at the chiral limit. The reader should have no difficulty convincing himself that they are exactly analogous to \(F_{{\mathrm {AP}};k}(y_0+t,y_0)\) and \( F_{{\mathrm {AP}};k}(y_0-t,y_0)\) appearing above. Their net effect is to add extra mass-dependent contributions to the \(\varDelta _k(y_0,t)\) functions. From now on, the \(\varDelta _k(y_0,t)\) functions are meant to include these contributions, proportional to the quark mass. Consequently, the uncertainty on the r.h.s. of Eq. (23) becomes \(\mathrm{O}(am,a^2)\).

It is interesting to compare the Ward identities we have derived here to the one introduced in Ref. [13] for the determination of \(Z_{\mathrm {A}}\). The former are valid for \(N_{\mathrm{f}}\ge 3\), while the latter for \(N_{\mathrm{f}}\ge 2\). The Ward identity of Ref. [13] involves correlation functions with two axial current insertions in the bulk. In our case we have more complicated contributions, consisting of time-differences of correlation functions with one axial current and one pseudoscalar density insertion.

Table 2 Ward identities WI(abcd) grouped into equivalence classes. Each class is labeled by four flavour indices abcd, of a representative element, listed in the leftmost column. All elements of the same class are grouped to the right. For more explanations, see text

3 Determination of \(Z_{\mathrm {S}}/(Z_{\mathrm {P}}Z_{\mathrm {A}})\) from Ward identities

Ward identity (23) is a master equation, from which a plethora of relations arise for specific choices of flavour indices abcd. In what follows, each of them will be distinguished by the label WI(abcd). Not all of them are suitable for the determination of \(Z_{\mathrm {S}}/Z_{\mathrm {P}}\). The following constraints need to be imposed:

  1. (i)

    \(b \ne c\); this ensures the suppression of the scalar term in Eq. (4);

  2. (ii)

    \(d^{bce} \ne 0\) and \(d^{ade} \ne 0\), so that the r.h.s. of Eq. (23) does not vanish. Note that once bc are fixed, property A in Appendix B ensures that \(d^{bce} \ne 0\) for a single value of e. Thus the summation over e on the r.h.s. of our master equation is trivial and the requirement \(d^{bce} d^{ade} \ne 0\) is satisfied for at most a single value of e;

  3. (iii)

    \(f^{bce} = 0\) for the choice of indices bce for which \(d^{bce} \ne 0\); \(f^{ade} = 0\) for the choice of indices ade for which \(d^{ade} \ne 0\). This follows from property B in Appendix B.

In spite of these constraints, a lot of freedom remains in the choice of flavour indices, resulting in many Ward identities. They are relations between the correlation functions of the master equation, which can be solved for \(Z_{\mathrm {S}}/(Z_{\mathrm {P}}Z_{\mathrm {A}})\). These Ward identities can be grouped into different equivalence classes. Each class consists of several identities WI(abcd) with different flavour indices \(a,\ldots ,d\), but identical flavour factors \(\mathrm{Re}\,(T_k)\) (\(k=1,3,5,7,8\)), and thus the same Eq. (23). Therefore, the same \(Z_{\mathrm {S}}/(Z_{\mathrm {P}}Z_{\mathrm {A}})\) estimate is obtained from all Ward identities of the same equivalence class. Estimates of \(Z_{\mathrm {S}}/(Z_{\mathrm {P}}Z_{\mathrm {A}})\) from Ward identities of different classes differ by discretisation effects.

Table 3 Classes of Ward identities (first column), the corresponding flavour factors of Eq. (23) (columns 2 to 6) and the product of symmetric tensors d of the same equation (last column)

The combinations of conditions (i)–(iii) simmer down to the choice of flavour indices (abcd), with \(b \ne c\), such that \(d^{bce} d^{ade}\ne 0\). We systematically investigated the choices of flavour indices which fulfill these conditions with a computer algebra program and grouped them into the equivalence classes which are tabulated in Table 2. These results depend on the \(su(N_{\mathrm{f}})\) Gell–Mann matrix definitions of Appendix B. Some interesting observations are:

  • There are pairs of equivalence classes that have the same number of elements. Examples are WI(1245) paired to WI(1425), WI(1144) paired to WI(1414) etc. These pairs of classes are separated by a single horizontal line in Table 2. Class WI(1468) does not have a partner.

  • The flavour factors \(\mathrm{Re}\,(T_k)\) for \((k = 1,3,5)\), \(T_7\), and \(T_8\) of paired classes have closely related numerical values; see Table 3. We will see below how this leads to useful relations between certain \(\varDelta _k\) functions.

  • The quark disconnected traces \(\varDelta _7\) and \(\varDelta _8\) do not contribute to the equivalence classes of the top half of Table 2 (separated by a triple line from the bottom half).

In Table 3 we collect the flavour factors \(\mathrm{Re}\,(T_k)\) (\(k=1,3,5\)), \(T_7\), and \(T_8\) for each class. Depending on the choice of flavour indices a,b,c,d, some of these flavour factors vanish. This simplifies the resulting Ward identity. Also here the top part of the Table (separated by a double line from the bottom half) lists the Ward identities without \(\varDelta _7\)- and \(\varDelta _8\)-type contributions.

There are two possible ways of using the 11 Ward identities of Table 3. A first approach would be to determine \(Z_{\mathrm {S}}/(Z_{\mathrm {P}}Z_{\mathrm {A}})\) from each of the 11 variants of Eq. (23). In principle these determinations differ by \(\mathrm{O}(am,a^2)\) effects and that should provide a handle for a good control of the related systematics. However, in practice the different \(Z_{{\mathrm {S}}}/(Z_{{\mathrm {P}}}Z_{{\mathrm {A}}})\) results are all obtained from the same configuration ensembles and are thus strongly correlated. Moreover, paired Ward identities (in the sense discussed above; cf. Table 2) have very similar relations between their \(\varDelta _k\)-terms and this also leads to very similar Z-ratios.

A second approach would be to combine these Ward identities in order to first obtain relations between the various \(\varDelta _k\)-terms. These would be true up to \(\mathrm{O}(am,a^2)\) at fixed gauge coupling, and once established, would simplify the equation(s) relating \(Z_{{\mathrm {S}}}/(Z_{{\mathrm {P}}}Z_{{\mathrm {A}}})\) to the \(\varDelta _k\)’s. In this spirit we proceed as follows:

  1. (i)

    Starting from Ward identities without quark disconnected contributions (i.e., with \(\mathrm{Re}\,(T_7) = \mathrm{Re}\,(T_8)=0\); top part of Table 3), we combine the pair WI(1245) and WI(1425) to obtain:

    $$\begin{aligned}&\varDelta _1(y_0,t) = \varDelta _5(y_0,t) +\mathrm{O}(am,a^2) \,, \end{aligned}$$
    (24)
    $$\begin{aligned}&Z_{\mathrm {A}} Z_{\mathrm {P}} a^3 \big [ \varDelta _1(y_0,t) - \varDelta _3(y_0,t) \big ] \nonumber \\&\quad = - Z_{\mathrm {S}} \mathrm{Re}\,\big [ F_{{\mathrm {S;1}}}(y_0) \big ] + \mathrm{O}(am,a^2) \,. \end{aligned}$$
    (25)

    Note that by combining the pair WI(1486) and WI(1846) we also obtain the above expressions, so this pair does not provide extra information.

  2. (ii)

    WI(1468), which has no partner, is written, in terms of the \(\varDelta \)’s defined in Eq. (22), as:

    $$\begin{aligned} \begin{aligned}&Z_{\mathrm {A}} Z_{\mathrm {P}} a^3 \big [ \varDelta _1(y_0,t) - 2 \varDelta _3(y_0,t) + \varDelta _5(y_0,t) \big ] \\&\quad = -2 \, Z_{\mathrm {S}} \mathrm{Re}\,\big [ F_{{\mathrm {S;1}}}(y_0) \big ] + \mathrm{O}(am,a^2) \,. \end{aligned} \end{aligned}$$
    (26)

    This on its own determines the ratio \(Z_{\mathrm {S}}/(Z_{\mathrm {P}}Z_{\mathrm {A}})\). Note that combined with Eq. (24), it gives us Eq. (25). Our conclusion is that all Ward identities with \(\mathrm{Re}\,(T_7) = \mathrm{Re}\,(T_8)=0\) reduce to the equality \(\varDelta _1 = \varDelta _5\) (i.e., diagrams \(F_{{\mathrm {AP}};1}\) and \(F_{{\mathrm {AP}};5}\) of Fig. 2 are related) and a single Ward identity, from which \(Z_{\mathrm {S}}/(Z_{\mathrm {P}}Z_{\mathrm {A}})\) may be computed.

  3. (iii)

    Passing to Ward identities with quark-disconnected contributions (bottom part of Table 3), we combine the pair WI(1188) and WI(1818) to obtain:

    $$\begin{aligned}&\varDelta _7(y_0,t) = \varDelta _8(y_0,t) +\mathrm{O}(am,a^2) \,, \end{aligned}$$
    (27)
    $$\begin{aligned}&Z_{\mathrm {A}} Z_{{\mathrm {P}}} a^3 \big [ 2 \varDelta _1(y_0,t) + \varDelta _3(y_0,t) + 3 \varDelta _7(y_0,t) \big ] \nonumber \\&\quad = -2 Z_{{\mathrm {S}}} \mathrm{Re}\,\big [ F_{{\mathrm {S;1}}}(y_0) \big ] + \mathrm{O}(am,a^2) \,, \end{aligned}$$
    (28)

    where Eq. (24) has also been used to arrive at Eq. (28).

  4. (iv)

    Similarly, the pair WI(1144) and WI(1414) combine to give

    $$\begin{aligned}&\varDelta _1(y_0,t) + 2 \varDelta _7(y_0,t) \nonumber \\&\quad =\varDelta _5(y_0,t) + 2 \varDelta _8(y_0,t) + \mathrm{O}(am,a^2) \,, \end{aligned}$$
    (29)
    $$\begin{aligned}&Z_{\mathrm {A}} Z_{\mathrm {P}} a^3 2 \big [ \varDelta _1(y_0,t) + \varDelta _3(y_0,t) + 2 \varDelta _7(y_0,t) \big ] \nonumber \\&\quad = -2 Z_{\mathrm {S}} \mathrm{Re}\,\big [ F_{{\mathrm {S;1}}}(y_0) \big ] + \mathrm{O}(am,a^2) \,. \end{aligned}$$
    (30)

    Eq. (29) carries no new information, as it is a combination of Eqs. (24) and (27).

  5. (v)

    If we now combine Eqs. (28) and (30), we obtain again Eq. (25) and the new relation

    $$\begin{aligned} \varDelta _3(y_0,t) = - \varDelta _7(y_0,t) +\mathrm{O}(am,a^2) \,. \end{aligned}$$
    (31)

The bottom line is that, up to \(\mathrm{O}(am,a^2)\) discretisation effects, the 11 Ward identities corresponding to the entries of Table 3 are not all independent. They can be combined to give three relations between the functions \(\varDelta _k\), which depend on traces of valence quark propagators, without references to flavour traces; these are Eqs. (24), (27), and (31).Footnote 7 The extent to which these relations are fulfilled at non-zero lattice spacing is an indicator of the size of discretisation effects. Moreover, if we take them at face value, the remaining Ward identities (25), (26), (28), and (30) reduce to a single expression. Any of them can be used to provide estimates of the ratio \(Z_{\mathrm {S}}/(Z_{\mathrm {P}}Z_{\mathrm {A}})\). We expect Eqs. (28), and (30) to be noisier, as they involve quark-disconnected diagrams. Eq. (25) seems promising, as it only involves \(\varDelta _1\) and \(\varDelta _3\), but it cannot be excluded a priori that Eq. (28) turns out to be better behaved. This can only be decided by numerical investigation.

Of course, these considerations do not exhaust all possibilities. Any linear combination of the Ward identities considered above, possibly combined with the relations (24), (27), (31), can be used for the computation of \(Z_{\mathrm {S}}/(Z_{\mathrm {P}}Z_{\mathrm {A}})\). For example, the linear combination \({\mathrm{L}}_1\equiv \) [WI(1245)−WI(1425)], combined with Eq. (24) gives:

$$\begin{aligned} \begin{aligned}&Z_{{\mathrm {A}}} Z_{{\mathrm {P}}} a^3 \big [ \varDelta _1(y_0,t) \big ] \\&\quad = - Z_{{\mathrm {S}}} \mathrm{Re}\,\big [ F_{{\mathrm {S;1}}}(y_0) \big ] + \mathrm{O}(am,a^2)\,. \end{aligned} \end{aligned}$$
(32)

The determination of \(Z_{{\mathrm {S}}}/(Z_{{\mathrm {P}}}Z_{{\mathrm {A}}})\) from the above depends only on quark-connected diagrams. Similarly, the linear combination \({\mathrm{L}}_2 \equiv \) [12WI(1818)−8WI(1414)] gives:

$$\begin{aligned} \begin{aligned}&Z_{{\mathrm {A}}} Z_{{\mathrm {P}}} a^3 \big [ \varDelta _1(y_0,t) + \varDelta _8(y_0,t) \big ] \\&\quad = - Z_{{\mathrm {S}}} \mathrm{Re}\,\big [ F_{{\mathrm {S;1}}}(y_0) \big ] + \mathrm{O}(am,a^2)\,, \end{aligned} \end{aligned}$$
(33)

which yields a \(Z_{\mathrm {S}}/(Z_{\mathrm {P}}Z_{\mathrm {A}})\) estimate from quark-connected and quark-disconnected diagrams. The last two expressions will be used in the following for numerical crosschecks.

4 Numerical setup and results

We investigate the proposed Ward identities on lattices with tree-level Symanzik improved gluons and Wilson-Clover quarks. The action coincides with the one used by CLS [18, 20, 21]. We employ Schrödinger functional boundary conditions in time, which enable us to simulate at quark masses close to the chiral point and control systematic effects related to the massless renormalisation framework. The details of this aspect are discussed in Sect. 4.1. Similar to the procedure in [15], we construct boundary-to-boundary three- and four-point functions with pseudoscalar Schrödinger functional wall sources and use wavefunctions at the boundaries as explained in [29]. The statistical error analysis is performed using a python implementation of the \(\Gamma \)-method [30] (exploiting information from the autocorrelation function) with automatic differentiation [31].

Table 4 Summary of simulation parameters: the first column (ID) labels our gauge configuration ensembles, the second column lists the lattice sizes \(L^3 \times T/a^4\), the third one the inverse gauge couplings \(\beta \), the fourth the Wilson hopping parameters \(\kappa \), the fifth shows the total number of molecular dynamics units MDU, the sixth the autocorrelation time of the slowest mode \(\tau _{{\mathrm {exp}}}\), and the last one the corresponding lattice spacing a, estimated from Ref. [19]

The gauge ensembles used in this study are detailed in Table 4. They coincide with the ones used in [23] but for the ensemble C1k1. These are essentially the ensembles used in [15, 29] plus the ensembles A1k3, A1k4, B1k4, C1k1, D1k2 and D1k4, which were added to improve the chiral fits. For the two ensembles E1k1 and E1k2 the number of molecular dynamics units was increased by factor of more than 4. The ensembles with volume \(L^3\times T\) described above are designed to lie on a line of constant physics (LCP), where the spatial extent of \(L \approx 1.2\,\mathrm{fm}\) and \(T/L\approx 3/2\) are almost constant. The Ward identity conditions which fix the ratio \(Z_{{\mathrm {S}}}/(Z_{{\mathrm {P}}} Z_{{\mathrm {A}}})\) are imposed at constant physics, i.e., we require that all length scales in the correlation functions, which define a given condition formulated through one of the foregoing Ward identities, are kept fixed in physical units. Once this requirement is satisfied, only the lattice spacing a changes as \(g_0\) is varied. Consequently, renormalisation constants (as well as their ratios) extracted from different constant physics conditions are expected to rapidly approach an almost unique function of \(g_0\) as \(g_0\rightarrow 0\). For a more general discussion of the constant physics idea in a similar context see, e.g., Ref. [32].

The initial tuning of this LCP was done based on the (universal) 2-loop beta-function as explained in Ref. [29]. Thus the volume of the lattices varies by \(\approx 10\)% over the range of couplings considered. However, using the results of Ref. [19], we verified that this deviation is proportional to the lattice spacing a and thus contributes to our quantity of interest only as a higher-order ambiguity.Footnote 8

The simulations in this work suffer from critical slowing down of the topological charge for smaller lattice spacings. This phenomenon, often dubbed “topology freezing”, could give unreliable results due to an insufficient sampling of topological sectors. We circumvent this problem by reweighting all data to the trivial topological sector \(Q = 0\) at the cost of decreasing the effective number of configurations; see [29, 34] for a discussion. Furthermore we increase the statistical uncertainties by attaching a tail to the integrated autocorrelation functions as proposed in [35]. As measure for \(\tau _{{\mathrm {exp}}}\), the autocorrelation time of the slowest mode in the simulation, we use the integrated autocorrelation time of the squared topological charge \(Q^2\) extracted from the longest Monte Carlo chain for each value of \(\beta \). The \(\tau _{{\mathrm {exp}}}\)-values for the individual ensembles can be found in Table 4.

In order to solve the Ward identity for \(Z_{{\mathrm {S}}}/Z_{{\mathrm {P}}}\) we need non-perturbative knowledge of the non-singlet axial current renormalisation constant \(Z_{\mathrm {A}}\) and the \({\mathrm {O}}(a)\) improvement coefficient \(c_{\mathrm {A}}\). The constant \(Z_{\mathrm {A}}\) was calculated on a subset of the gauge configurations in this work, Ref. [15], as well as in the chirally rotated Schrödinger functional, Ref. [22], which is a completely different determination. We prefer the results from the latter because of their smaller statistical uncertainties. The errors of \(Z_{\mathrm {A}}\) are accounted for in quadrature when solving for \(Z_{\mathrm {S}}/Z_{\mathrm {P}}\) in our Ward identity expressions. For \(c_{\mathrm {A}}\) we use the results of [29], without error, following standard practice.

In principle the ratio we would like to determine, as well as all correlation functions involved, depend on the \({\mathrm {O}}(a)\) improved coupling \(\tilde{g}_0^2=g_0^2[1+ab_{\mathrm {g}}{\mathrm {tr}}\,M_{\mathrm {q}}/N_{\mathrm {f}}]\), where the coefficient \(b_{\mathrm {g}}\) is only known at 1-loop perturbation theory [2]. This issue is of no relevance here, as all normalisation conditions are imposed at zero quark mass. However, this should be kept in mind when using results obtained here in a different setting with non-vanishing sea quark masses.

In order to study the scaling behaviour of some of our results, we need the lattice spacings in physical units at the bare couplings used in this work. In Ref. [19], such values are provided for couplings close to those in Table 4; these enable us to extract the lattice spacings at our gauge couplings using a polynomial interpolation.

As additional cross checks we investigate the non-perturbative validity of the identities (24), (27) and (31). The results can be found in Appendix E.

4.1 Chiral extrapolation

From the plethora of possible renormalisation conditions listed in Sect. 3, we single out a class labeled WI(1468) to which only quark connected diagrams contribute and for which the statistical precision is best. We detail the analysis for this specific choice, but the same steps also apply to any other identity discussed in the following.

Fig. 3
figure 3

Comparison of the chiral extrapolation for WI(1468) at \(\beta =3.676\) with and without the term proportional to the mass. In the massless case the data cannot be described by a linear function in am for the full mass range. The dotted line visualises the chiral extrapolation of the massless data set excluding the outmost data point. When the mass term is included, the data shows no significant quark mass dependence. The slope of the linear fit function, shown as the dashed line, where the shaded area corresponds to the 1\(\sigma \) uncertainty, is zero within error

In order to obtain \(Z_{{\mathrm {S}}}/Z_{{\mathrm {P}}}\) at vanishing quark mass, we extra- or interpolate the data at fixed bare coupling to the chiral point. For this procedure we employ the \({\mathrm {O}}(a)\) improved PCAC mass, which we average over the central third of the temporal extent of the lattice, similarly to what was done in Ref. [23]. This choice keeps the plateau length approximately constant in physical units. For the insertion times in the master Eq. (23), we chose \(y_0=T/2\) and \(t=T/6\) rounded up to the closest integer.Footnote 9 The idea behind this choice is to place the operators as far away from the temporal boundaries as possible, so as to suppress boundary induced cutoff effects, while keeping the individual operators apart from each other, thus avoiding contact terms.

In Fig. 3 we show the chiral extrapolation of our preferred determination WI(1468), at \(\beta =3.676\), where quark masses cover a large range in lattice units. We compare results obtained from the Ward identity with and without the mass term [i.e., the term with two pseudoscalar insertions in Eq. (11)]. We see that in the “massive” case our results display a linear behaviour in the whole mass range. In addition statistical uncertainties are smaller and the data show an almost flat dependence on am, resulting to a more reliable chiral extrapolation. Therefore, we obtain \(Z_{\mathrm {S}}/Z_{\mathrm {P}}\) in the chiral limit by fitting linearly the results of the “massive” case. For this fit we employ orthogonal distance regression [36] which takes into account not only errors in the dependent, but also in the independent variable. The error obtained from this procedure for the chirally extrapolated \(Z_{\mathrm {S}}/Z_{\mathrm {P}}\) is in general larger compared to the one obtained from a standard least squares fit. Results for the individual ensembles as well as the chiral extrapolations are summarised in Table 5, which will be discussed in Sect. 4.2.

Table 5 Summary of results for am and \(Z_{{\mathrm {S}}}/Z_{{\mathrm {P}}}\) from different Ward identity determinations, labelled by WI(abcd). The Ward identity linear combinations \({\mathrm{L}}_1\) and \({\mathrm{L}}_2\) are defined in Eqs. (32) and (33). In all Ward identities the mass terms with two pseudoscalar insertions in the bulk have been included; cf. eq. (11). The errors quoted for the individual ensembles are statistical; the uncertainty on the values at the chiral point stem from the orthogonal distance regression procedure of Ref. [36]

4.2 Scaling

Fig. 4
figure 4

Dependence of \(Z_{\mathrm {S}}/Z_{\mathrm {P}}\) on the gauge coupling \(g_0^2\). Results are obtained from the 11 Ward identity classes listed in Table 3. Open symbols are used for the Ward identity classes with connected-quark diagrams only; closed symbols denote Ward identity classes with both connected- and disconnected-quark diagrams. Closely related Ward identities (which are separated by a single horizontal line in Table 2) are shown with the same symbol. Data from WI(1144) are shown at their exact abscissa position, while the others have been slightly displaced in the \(g_0^2\)-direction, in order to improve visibility

In Table 3 we have listed 11 classes of distinct Ward identities; each of them is a different relation between correlation function differences \(\varDelta _k\) \((k=1,3,5,7,8)\) and \(F_{{\mathrm {S}};1}\), from which \(Z_{\mathrm {S}}/Z_{\mathrm {P}}\) may be obtained. In Fig. 4 we show these determinations in the chiral limit as functions of the gauge coupling \(g_0^2\). It is evident, as argued in Sect. 3, that there are very strong correlations between results obtained on the same configuration ensembles from “similar” Ward identity classes, as grouped in Table 2.

We are thus led to select, from the plethora of Ward identities, four representative determinations of \(Z_{\mathrm {S}}/Z_{\mathrm {P}}\). Two of these involve only quark connected diagrams. These are WI(1245) and the linear combination \({\mathrm {L}}_1\), leading to Eq. (32). The other two determinations involve both quark connected and disconnected diagrams and are therefore numerically more challenging. Here we chose WI(4488), and the linear combination \({\mathrm {L}}_2\), leading to Eq. (33). The results for each ensemble and in the chiral limit are shown in Table 5.

To evaluate the relative cutoff effects among our different results, we form ratios of \(Z_{\mathrm {S}}/Z_{\mathrm {P}}\), obtained from each of the four determinations described above, to \(Z_{\mathrm {S}}/Z_{\mathrm {P}}\) from our preferred identity WI(1468). We investigate the lattice spacing dependence of each of these four ratios which, in our Symanzik-improved setup, consists of powers of \(a^2\) and higher. The ratios are known to tend to unity in the continuum limit. We therefore fit them with polynomials in the lattice spacing, constrained to be 1 at the origin. Results are displayed in Fig. 5. The top panel of the figure displays results from the first two determinations, without quark disconnected contributions.

Fig. 5
figure 5

Lattice spacing dependence of the ratio of different \(Z_{{\mathrm {S}}}/Z_{{\mathrm {P}}}\) determinations to \(Z_{{\mathrm {S}}}/Z_{{\mathrm {P}}}\) from WI(1468). The top panel depicts results from Ward identities which involve quark-connected diagrams only, while the bottom panel shows results from Ward identities which also involve quark-disconnected diagrams

The deviations from 1 in the ratio WI(1245)/WI(1468) are very mild and can be described by a single term quadratic in the lattice spacing with \(\chi ^2/\text {d.o.f}=0.474\). For the ratio \({\mathrm {L}}_1/\)WI(1468) the deviation from 1 as well as the statistical uncertainties are larger. A glance at Fig. 5 should convince the reader that the data cannot be described by a single-parameter fit with a quadratic term. Fitting with \(1 + c_2 a^2 + c_3 a^3\) results to \(c_2 = -9.3(4.3)\), \(c_3 = 303 (71)\) and \(\chi ^2/\text {d.o.f}=0.138\). A one-parameter fit with a term proportional to \(a^3\) gives \(c_3 = 169(22)\) with \(\chi ^2/\text {d.o.f}=0.775\); this is the curve shown in Fig. 5. The bottom panel of Fig. 5 displays results from the determinations with quark disconnected contributions. Again it is obvious that none of the data displays a pure \(a^2\)-dependence. Fitting the ratio WI(4488)/WI(1468) with \(1 + c_2 a^2 + c_3 a^3\) results to \(c_2 = -26(28)\), \(c_3 = 911 (410)\) and \(\chi ^2/\text {d.o.f}=0.494\); note that \(c_2\) is compatible with zero. Fitting by \(1 + c_3 a^3\) gives \(c_3 = 567(131)\) and \(\chi ^2/\text {d.o.f}=0.511\); this is the fit shown in the Figure. For the ratio \({\mathrm {L}}_2/\)WI(1468) we again fit with two parameters, one quadratic and one cubic in the lattice spacing, obtaining \(c_2 = -7.8(4.6)\), \(c_3 = 211(68)\) and \(\chi ^2/\text {d.o.f}=1.719\). The relatively large value for \(\chi ^2/\text {d.o.f}\) can be traced to the data point at the coarsest lattice spacing. All four cases conform with the theoretical expectation of \({\mathrm {O}}(a^2)\) ambiguities or higher. We did not find any evidence for \({\mathrm {O}}(a)\) cutoff effects; trying to fit an additional term proportional to a gives coefficients which are zero within errors.

4.3 Interpolation formula

To facilitate the use of our \(Z_{\mathrm {S}}/Z_{\mathrm {P}}\) results in large volume simulations, we provide an interpolation formula for lattice spacings \(0.04\,\)fm\(\,\lesssim a \lesssim 0.1\,\)fm. Having tried several fit ansätze, we opt for a Padé interpolation constrained by the 1-loop value [37] of the form

$$\begin{aligned} \bigg (\frac{Z_{\mathrm {S}}}{Z_{\mathrm {P}}}\bigg )(g_0^2)&=1+0.020164\,g_0^2\times \frac{1-Z_{\mathrm {SP}}^{(0)}g_0^2+Z_{\mathrm {SP}}^{(1)} g_0^4}{1-Z_{\mathrm {SP}}^{(2)}g_0^2}\,, \end{aligned}$$
(34a)
$$\begin{aligned} Z_{\mathrm {SP}}^{(0)}&=-0.5357\,,\quad Z_{\mathrm {SP}}^{(1)}=0.2883\,,\quad Z_{\mathrm {SP}}^{(2)}=-0.5117\,, \end{aligned}$$
(34b)

with the covariance matrix

$$\begin{aligned}&{\mathrm {cov}}(Z_{{\mathrm {SP}}}^{(i)},Z_{{\mathrm {SP}}}^{(j)}) \nonumber \\&\quad =\left( \begin{array}{ccc} +2.0195e-01 &{}\quad -1.3844e-01 &{}\quad -4.1248e-03 \\ -1.3844e-01 &{}\quad +9.5121e-02 &{}\quad +2.8754e-03 \\ -4.1248e-03 &{}\quad +2.8754e-03 &{}\quad +9.6128e-05\\ \end{array}\right) , \end{aligned}$$
(34c)

and \(\chi ^2/\text {d.o.f.}=0.169\).

As the functional form in the non-perturbative coupling region is in principle unknown, we investigated the significance of systematic effects by also experimenting with alternative forms of interpolating functions (such as higher-order Padés, exponentials and polynomials), constrained to monotonically approach the 1-loop perturbation theory result. However, among those describing our results reliably (as signaled by an acceptable \(\chi ^2/\text {d.o.f.}\)) practically coincide with the interpolation (34) in the fitted range of couplings, so that the associated systematic errors are negligible compared to the statistical ones. Therefore, we only account for systematic uncertainties when extrapolating with Eq. (34) to values slightly outside the fitted range by adding a systematic error of 50% of the size of the statistical one in quadrature. This prescription is applied at \(\beta =3.85\), which corresponds to the finest lattice spacing simulated by the CLS effort.

The WI(1468) results with the interpolation are shown in Fig. 6, where they are also compared to the prediction of 1-loop perturbation theory. The vertical dashed lines mark the bare couplings used in CLS simulations, to which we want to interpolate our results. Results for \(Z_{\mathrm {S}}/Z_{\mathrm {P}}\) at the \(g_0^2\)-values used in \(N_{\mathrm {f}} = 2 + 1\) CLS simulations are given in Table 6.

Fig. 6
figure 6

\(Z_{\mathrm {S}}/Z_{\mathrm {P}}\) results from WI(1468), extrapolated to the chiral point, plotted against the bare gauge coupling \(g_0^2\). The Padé interpolation formula (34), shown with errorband, is used to propagate the statistical uncertainty. The 1-loop perturbative result from Ref. [37] is shown for comparison. The vertical dashed lines indicate the CLS couplings of Refs. [18, 20, 21]

Table 6 \(Z_{\mathrm {S}}/Z_{\mathrm {P}}\) results from WI(1468) (second column) and from Ref. [23] for two lines of constant physics (LCP), specified there. The inverse gauge couplings \(\beta \) are those used in \(N_{\mathrm {f}} = 2 + 1\) CLS simulations [18, 20, 21]. The error of the WI(1468) results is the statistical uncertainty propagated from the interpolation formula (34) except for \(\beta =3.85\) where we added a systematic uncertainty, 50% of the size of the statistical one, in quadrature. For the results of the two LCP columns we combine the errors of Z (from Ref. [23]) and \(Z_{\mathrm {A}}\) (from Ref. [22]) in quadrature

4.4 Comparison with previous works

We are not aware of any direct determinations of \(Z_{\mathrm {S}}/Z_{\mathrm {P}}\) in our specific setup, but we can compare our findings, using existing results for the quark mass renormalisation constant \(Z \equiv Z_{\mathrm {P}}/(Z_{\mathrm {S}} Z_{\mathrm {A}})\). The idea is to compute \(Z_{\mathrm {S}}/Z_{\mathrm {P}}=(Z Z_{\mathrm {A}})^{-1}\), with Z from either Ref. [20] or Ref. [23], and \(Z_{\mathrm {A}}\) from Ref. [22]. In Ref. [20], Z has been computed on large-volume CLS ensembles, from the relation between PCAC quark masses \(m_{ij}\) and subtracted quark masses \(m_{{\mathrm {q}},ij}\) (see Sect. 5 and Appendix C for these mass definitions). The Z-results in Ref. [23] were obtained on almost the same gauge ensembles used in this workFootnote 10 at small volumes and nearly-chiral sea quark masses. The method of Ref. [23] is based on suitable combinations of renormalised quark masses, defined both through the PCAC relation and the subtracted bare mass, evaluated in the \({\mathrm {O}}(a)\) improved theory with non-degenerate valence quarks, including all necessary counterterms. Results are quoted for two different lines of constant physics labeled LCP-0 and LCP-1, which differ by the values at which the quark masses in the valence sector are kept fixed as \(g_0\) is varied.

We compute the ratio of \(1/(Z Z_{\mathrm {A}})\) from Refs. [20] and [23] to \(Z_{\mathrm {S}}/Z_{\mathrm {P}}\) from our preferred WI(1468). We investigate the lattice spacing dependence of this ratio, which consists of powers of \(a^2\) and higher, and tends to unity in the continuum limit. The results are plotted in Fig. 7. Polynomial fits are performed on the LCP-0 and LCP-1 ratios, excluding the data of the coarsest ensembles, which display poor scaling behaviour and large errors. A two-parameter fit of the form \(1+ c_2 a^2 + c_3 a^3\) results to \(\chi ^2/\text {d.o.f} = 0.281\), \(c_2=-2.5(3.7)\) and \(c_3=242(57)\) for LCP-0, and \(\chi ^2/\text {d.o.f} = 0.166\), \(c_2=1.5(2.8)\) and \(c_3=148(45)\) for LCP-1, in both cases \(c_2\) is consistent with zero. We thus prefer to plot the results as functions of \(a^3\) in Fig. 7, where we also show a one-parameter fit of the form \(1+ c_3 a^3\); for this ansatz we obtain \(\chi ^2/\text {d.o.f} = 0.300\), \(c_3 = 206(14)\) for LCP-0 and \(\chi ^2/\text {d.o.f} = 0.170\), \(c_3 = 169(12)\) for LCP-1.Footnote 11 We interpret this as confirmation that the two methods are compatible w.r.t. the expected lattice spacing ambiguities and that the effects of \(\mathrm{O}(a^2)\) are sub-dominant compared to the next higher order.

Fig. 7
figure 7

Scaling behaviour of the ratio of \(Z_{\mathrm {S}}/Z_{\mathrm {P}}\) deduced from results in Refs. [20, 23] to \(Z_{\mathrm {S}}/Z_{\mathrm {P}}\) from WI(1468)

Let us briefly comment on the possible benefits of the respective results on \(Z_{\mathrm {S}}/Z_{\mathrm {P}}\) collected in Table 6, originating from the different approaches underlying Ref. [23] and this work. First, one observes comparable uncertainties between the two. While the method of that reference involves combinations of simpler and thus typically less noisy correlation functions (i.e., with only one operator insertion in the bulk) as well as an accurate computation of the valence quark mass dependence prior to the chiral extrapolations, our estimates on \(Z_{\mathrm {S}}/Z_{\mathrm {P}}\) from the more direct Ward identity approach followed here exhibit an overall flatter and, at larger couplings, less steep \(g_0^2\)-dependence. This points to generically smaller cutoff effects so that continuum extrapolations of quantities where it enters may be expected to become better controlled and more precise in the long run, because they are also less affected by unpleasantly significant admixtures of higher-order cutoff effects.

The results for Z presented in [20], stemming from large-volume calculations on a subset of the CLS ensembles, are only available at two values of the bare coupling, which do not coincide with the couplings investigated in this work. In order to compare with our results we make use of the interpolation formula Eq. (34). Although the estimates for Z from Ref. [20] are only available at two values of the bare coupling and we hence do not attempt a fit in this case, we notice that they are compatible with LCP-0.

In summary, comparison with earlier works is consistent with the expectation that all ambiguities between different determinations of \(Z_{\mathrm {S}}/Z_{\mathrm {P}}\) show a scaling according to \(\mathrm{O}(a^2)\) or higher. However, the size of these ambiguities is quite large and may still have a relevant impact on applications as described in the next Section.

5 Application: quark mass computations with Wilson fermions

We will now discuss a method of computing quark masses with Wilson fermions which uses the ratio \(Z_{\mathrm {S}}/Z_{\mathrm {P}}\).

First we review the well-established “PCAC quark mass method”. It is the conventional ALPHA Collaboration approach, which relies on the PCAC definition of quark masses \(m_{ij}\) of Eq. (C.7). These bare current masses are computed on large physical volumesFootnote 12 and for a range of couplings typical of hadronic, low-energy scales \(\mu _{{\mathrm {had}}} \sim \Lambda _{{\mathrm {QCD}}}\). Although we keep our notation as general as possible, for concreteness we consider a theory with \(N_{\mathrm{f}}= 2+1\) dynamical fermions; i.e. the two lightest flavours are degenerate in mass while the third flavour is heavier (\(m_{\mathrm{q,1}} = m_{\mathrm{q,2}} < m_{\mathrm{q,3}}\)).

We see from Eq. (C.8) that the renormalised light mass is given by

$$\begin{aligned} m_{1,{\mathrm{R}}}&= m_{2,{\mathrm{R}}} = \dfrac{Z_{\mathrm {A}}}{Z_{\mathrm {P}}} \,\, m_{12} \, \nonumber \\&\quad \times \Big [ 1 \, + (b_{\mathrm{A}}-b_{\mathrm{P}}) a m_{{\mathrm {q}},12} \, + \, (\bar{b}_{\mathrm {A}}- \bar{b}_{\mathrm {P}})a\ \hbox {Tr}(M_{\mathrm {q}}) \Big ] + \mathrm{O}(a^2) \,. \end{aligned}$$
(35)

The ratio of the heavy to light renormalised masses is also derived from the above expression:

$$\begin{aligned}&\dfrac{m_{3,{\mathrm{R}}}}{m_{1,{\mathrm{R}}}} \nonumber \\&\quad = 2 \, \dfrac{m_{13}}{m_{12}} \, \Big [ 1 + (b_{\mathrm{A}}- b_{\mathrm{P}}) \dfrac{(am_{{\mathrm{q,3}}} - am_{\mathrm{q,2}} )}{2} \, \Big ] - 1 + \mathrm{O}(a^2) \,. \end{aligned}$$
(36)

Knowing the renormalised light mass from Eq. (35), and the ratio of the heavy and light renormalised masses from Eq. (36), the up/down and strange masses are obtained [19, 25]. So in principle this method requires:

  1. 1.

    The axial current normalisation \(Z_{\mathrm {A}}(g_0^2)\) and the renormalisation constant \(Z_{\mathrm {P}}(g_0^2,\mu _{{\mathrm {had}}})\) of the non-singlet pseudoscalar density; the latter carries the renormalisation scheme and scale dependence of the continuum quark mass. In our \(N_{\mathrm{f}}=3\) setup, these may be found in Refs. [22] and [42], respectively.

  2. 2.

    The Symanzik-improvement coefficients \((b_{\mathrm{A}}- b_{\mathrm{P}})\) and \((\bar{b}_{{\mathrm {A}}} - \bar{b}_{{\mathrm {P}}})\). Non-perturbative \((b_{\mathrm{A}}- b_{\mathrm{P}})\)-estimates in our setup may be found in Ref. [23]. Note that in perturbation theory \((\bar{b}_{{\mathrm {A}}} - \bar{b}_{{\mathrm {P}}}) \sim {\mathrm{O}}(g_0^4)\), so that the term proportional to this coefficient is habitually dropped.

  3. 3.

    It is also noteworthy that Eq. (36) does not require knowledge of \(\kappa _{{\mathrm {crit}}}\), which is however needed in \(m_{{\mathrm {q}},12}\) and \(\hbox {Tr}(M_{{\mathrm {q}}})\) in Eq. (35). We shall return to this point in Sect. 5.1.

Based on the results of Ref. [43] for Symanzik-improved quark masses with Wilson fermions, an alternative approach, known as the “ratio-difference method”, has been proposed in Ref. [24]. The renormalised quark mass difference is given by

$$\begin{aligned} \begin{aligned}&m_{3,\mathrm{R}} - m_{1,\mathrm{R}} = Z_{{\mathrm {S}}}^{-1} \Big [ m_{{\mathrm {q}},3} - m_{{\mathrm {q}},1} \Big ] \,\, \\&\quad \times \Big [ 1 + a 2 b_m \, m_{{\mathrm {q}},13} + a \bar{b}_m\ \hbox {Tr}(M_{\mathrm {q}}) \Big ] + \mathrm{O}(a^2) \,. \end{aligned} \end{aligned}$$
(37)

Knowing the renormalised mass difference from Eq. (37), and the ratio of the heavy and light renormalised masses from Eq. (36), the up/down and strange masses are obtained. So in principle this method requires:

  1. 1.

    The renormalisation constant \(Z_{\mathrm {S}}(g_0^2,\mu _{{\mathrm {had}}})\) of the non-singlet scalar density, which carries the renormalisation scheme and scale dependence of the continuum quark mass.

  2. 2.

    The Symanzik-improvement coefficients \((b_{\mathrm{A}}- b_{\mathrm{P}})\), \(b_m\) and \(\bar{b}_m\). Non-perturbative estimates of the \(b_m\)-coefficient in this setup may be found in Ref. [23].Footnote 13 Since \(\bar{b}_m \sim {\mathrm{O}}(g_0^4)\), the term proportional to \(\hbox {Tr}(M_{\mathrm {q}})\) is habitually dropped.

  3. 3.

    The critical hopping parameter \(\kappa _{\mathrm {crit}}\) is needed in \(m_{{\mathrm {q}},13}\) and \(\hbox {Tr}(M_{\mathrm {q}})\) in Eq. (37). We shall return to this point in Sect. 5.1.

We have outlined the basic idea behind the PCAC quark mass method and the ratio-difference method, listing the renormalisation parameters and improvement coefficients required by each one. The most crucial difference is that in the PCAC quark mass method all bare masses are given in terms of the current masses \(m_{12}\) and \(m_{13}\), which are renormalised by \(Z_{\mathrm {P}}^{-1} Z_{\mathrm {A}}\), while in the ratio-difference method the bare mass difference is the exactly known \([m_{{\mathrm {q}},3} - m_{{\mathrm {q}},1}]\), which is renormalised by \(Z_{\mathrm {S}}^{-1}\). It is not possible to determine \(Z_{{\mathrm {S}}}\) with a Schrödinger functional renormalisation condition analogous to that introduced in Ref. [44] for \(Z_{{\mathrm {P}}}\). The latter involves correlation functions with a pseudoscalar source at the boundary (see Eq. (A.9)) and the pseudoscalar scalar operator at the bulk. If we place a scalar operator at the bulk, keeping the pseudoscalar boundary source, the correlation function vanishes due to parity. Nor is it possible to have a scalar source at the boundary and the scalar density at the bulk, since this would result in the product \(P_+ P_-\) of the projection operators of the boundary quarks and the vanishing of the correlation function. An option would be to impose a renormalisation condition on the correlation function \(\langle {\mathcal {O}}^{\prime a} \, S^b(x) \, {\mathcal {O}}^c \rangle \), with the two pseudoscalar boundary sources \({\mathcal {O}}^{\prime a}\) and \({\mathcal {O}}^c\) and the scalar operator \(S^b\) in the bulk. This would be an acceptable intermediate scheme of the Schrödinger functional variety, but different than the one introduced in Ref. [44] for \(Z_{{\mathrm {P}}}\). Thus, the renormalised quark masses \(m_{1{\mathrm{R}}},m_{3{\mathrm{R}}}\) obtained by combining Eqs. (35) and (36) (PCAC quark mass method with \(Z_{{\mathrm {P}}}\)) would be in a different scheme than those obtained from Eqs. (37) and (36) (difference-ratio method with \(Z_{{\mathrm {S}}}\)). Only results obtained for the scheme-independent renormalisation group invariant (RGI) masses from the two methods would be comparable. This comparison would be very useful but cumbersome, as it requires the computation from scratch of the step scaling function in the new intermediate scheme, from ratios of \(Z_{{\mathrm {S}}}\)’s at fixed renormalised coupling and two different renormalisation scales, and for a range of couplings.

Given the above considerations, we are led to define the scalar operator renormalisation parameter through:

$$\begin{aligned} Z_{{\mathrm {S}}}(g_0^2,\mu _{{\mathrm {had}}}) = \Bigg [ \dfrac{Z_{{\mathrm {S}}}(g_0^2, \mu _{{\mathrm {had}}})}{Z_{{\mathrm {P}}}(g_0^2,\mu _{{\mathrm {had}}})} \Bigg ] Z_{{\mathrm {P}}}(g_0^2,\mu _{{\mathrm {had}}}) \,. \end{aligned}$$
(38)

This is our definition of the Schrödinger functional renormalisation scheme for the scalar non-singlet operator. The \(Z_{{\mathrm {S}}}/Z_{{\mathrm {P}}}\)-ratio on the r.h.s. is scale independent, being determined from Ward identities. Clearly, scalar and pseudoscalar densities have the same renormalisation group running properties (i.e., the same anomalous dimensions, the same step scaling functions in the continuum, etc.). So knowledge of the \(Z_{{\mathrm {S}}}/Z_{{\mathrm {P}}}\) ratio enables us to obtain the light and heavy quark masses in the usual Schrödinger functional scheme [44], but with a different method based on mass differences (and \(Z_{{\mathrm {S}}}\)) combined with scale-independent PCAC mass ratios. The novel renormalisation and improvement patterns provide an important handle for the control and reduction of systematic effects related to the non-perturbative determination of renormalisation parameters and discretisation errors.Footnote 14 What is common in both methods is the renormalisation group running that takes us non-perturbatively from renormalised masses at low energy scales \(\mu _{{\mathrm {had}}}\) to masses at large, perturbative scales \(\mu _{\mathrm {PT}} \sim M_{{\mathrm {W}}}\), as described in Ref. [44]. For recent results on the running of quark masses in \(N_{\mathrm{f}}= 3\) QCD see Ref. [42].

5.1 Subtracted masses, PCAC masses, and redefined Symanzik counterterms

We will close this section by reviewing how, in both methods, we can circumvent the need to use \(\kappa _{\mathrm {crit}}\) in the Symanzik counterterms of Eqs. (35) and (37), which feature subtracted masses \(am_{{\mathrm {q}},ij}\) and \(\mathrm{Tr}[aM_{\mathrm{q}}]\). This can be avoided by substituting these subtracted masses with current quark masses. Their relation is given by [43],

$$\begin{aligned} m_{ij} = Z \bigg [ m_{{\mathrm {q}},ij} + \left( r_m - 1 \right) \dfrac{\mathrm{Tr}[M_{\mathrm{q}}]}{N_{\mathrm{f}}} \bigg ] + \mathrm{O}(a) \,, \end{aligned}$$
(39)

where \(Z(g_0^2)\equiv Z_{{\mathrm {P}}}/(Z_{{\mathrm {S}}} Z_{{\mathrm {A}}})\) and \(r_m \equiv Z_{{\mathrm {S}}}/Z_{{\mathrm {S}}^0}\) are finite normalisations (\(Z_{{\mathrm {S}}^0}\) is the renormalisation parameter of the singlet scalar density). In the above we neglect \(\mathrm{O}(a)\) terms, as they only contribute to \(\mathrm{O}(a^2)\) in the b-counterterms of Eqs. (35) and (37). Substituting \(am_{{\mathrm {q}},ij}\rightarrow am_{ij}\) in these expressions, we obtain respectively

$$\begin{aligned} \begin{aligned} m_{1,{\mathrm{R}}}&= m_{2{\mathrm{R}}} = \dfrac{Z_{{\mathrm {A}}}}{Z_{{\mathrm {P}}}} \, m_{12} \,\, \\&\quad \times \Bigg [ 1 + (\tilde{b}_{\mathrm {A}}-\tilde{b}_{\mathrm {P}}) am_{12} + \Bigg \{ (\tilde{b}_{\mathrm {A}}- \tilde{b}_{\mathrm {P}}) \dfrac{1-r_m}{r_m} \\&\quad + (\bar{b}_{\mathrm {A}} - \bar{b}_{\mathrm {P}}) \dfrac{N_{\mathrm{f}}}{Z r_m} \Bigg \} \dfrac{a M_{\mathrm{sum}}}{N_{\mathrm{f}}} \Bigg ] + \mathrm{O}(a^2) \,, \end{aligned} \end{aligned}$$
(40)

and

$$\begin{aligned}&m_{3,\mathrm{R}} - m_{1,\mathrm{R}} = Z_{{\mathrm {S}}}^{-1} \Big [ m_{{\mathrm {q}},3} - m_{{\mathrm {q}},1} \Big ] \,\, \nonumber \\&\quad \times \Bigg [ 1 + 2 \tilde{b}_m \, a m_{13} \nonumber \\&\quad + \Bigg \{ 2 \tilde{b}_m \dfrac{1-r_m}{r_m} + \bar{b}_m \dfrac{N_{\mathrm{f}}}{Z r_m} \Bigg \} \dfrac{a M_{\mathrm{sum}}}{N_{\mathrm{f}}} \Bigg ] + \mathrm{O}(a^2) \,, \end{aligned}$$
(41)

where we define

$$\begin{aligned}&\tilde{b}_{\mathrm {A}}- \tilde{b}_{\mathrm {P}} \equiv \dfrac{b_{\mathrm{A}}-b_{\mathrm{P}}}{Z} \,, \qquad \tilde{b}_m \equiv \dfrac{b_m}{Z} \,, \end{aligned}$$
(42)
$$\begin{aligned}&M_{\mathrm{sum}} \equiv m_{12} + m_{23} + \cdots + m_{(N_{\mathrm{f}}-1)N_{\mathrm{f}}} + m_{N_{\mathrm{f}}1} \nonumber \\&\quad = Z r_m \mathrm{Tr}[M_{\mathrm{q}}] + \mathrm{O}(a) \,. \end{aligned}$$
(43)

Thus, \(am_{{\mathrm {q}},ij}\) and \(\kappa _{{\mathrm {crit}}}\) in Eqs. (35) and (37) have been traded off for \(m_{ij}\), Z, and \(r_m\). Accurate non-perturbative estimates of Z, \((b_{\mathrm{A}}-b_{\mathrm{P}})\), and \(b_m\) in our \(N_{\mathrm{f}}=3\) setup have been reported in Ref. [23]. The term multiplying \(M_{\mathrm{sum}}\) contains \((1-r_{m})/r_m\) and \((\bar{b}_{{\mathrm {A}}} - \bar{b}_{{\mathrm {P}}})\). To leading order in perturbation theory \(r_m = 1 + 0.001158\,C_{\mathrm{F}}\,N_{\mathrm{f}}\,g_0^4\) [20, 45]; thus \((1-r_m)/r_m \sim \mathrm{O}(g_0^4)\). A first non-perturbative study of the coefficients \(\bar{b}_{\mathrm {A}}\), \(\bar{b}_{\mathrm {P}}\), and \(\bar{b}_m\) produced noisy results with 100% errors [46]. Since in perturbation theory \((\bar{b}_{\mathrm {A}}- \bar{b}_{\mathrm {P}}), \bar{b}_m \sim \mathrm{O}(g_0^4)\) [43], the terms proportional to \(M_{\mathrm{sum}}\) are habitually dropped.

For completeness we also discuss a slightly different way to write the \(b_m\)-counterterm of the renormalised quark mass difference of Eq. (37), in close analogy to what is done in Ref. [24]. The term in question is written as follows:

$$\begin{aligned} \begin{aligned}&a b_m [ m_{{\mathrm {q}},3} + m_{{\mathrm {q}},1} ] = a b_m [ m_{{\mathrm {q}},3} + m_{{\mathrm {q}},1} ] \Bigg [ \dfrac{m_{{\mathrm {q}},3} - m_{{\mathrm {q}},1}}{ m_{{\mathrm {q}},3} - m_{{\mathrm {q}},1}} \Bigg ] \\&\quad = a b_m \bigg [ m_{{\mathrm {q}},3} - m_{{\mathrm {q}},1} \bigg ] \dfrac{\bigg [\dfrac{m_{33^\prime }}{m_{12}} + 1 \bigg ] + \dfrac{2(1-r_m)}{r_m} \dfrac{M_{\mathrm{sum}}}{m_{12}}}{\bigg [\dfrac{m_{33^\prime }}{m_{12}} - 1 \bigg ]} \,. \end{aligned} \end{aligned}$$
(44)

We arrive at the second expression using Eq. (39) and introducing the PCAC mass \(m_{33^\prime }\), which consists of two degenerate but distinct heavy valence flavours. Neglecting the term proportional to \(M_{\mathrm{sum}}\) in Eq. (44), we conclude that in this approximation the difference-ratio method is based on Eqs. (36) and (37), which depend on the exactly known subtracted quark mass difference \([m_{{\mathrm {q}},3} - m_{{\mathrm {q}},1}]\) and suitable PCAC quark mass ratios, but not on subtracted quark mass averages \(m_{{\mathrm {q}},ij}\) and \(\kappa _{\mathrm {crit}}\).

6 Conclusions

In the present study we have addressed, for the first time within the finite-volume Schrödinger functional setup, the non-perturbative determination of the ratio of the scalar to pseudoscalar non-singlet renormalisation constants \(Z_{\mathrm {S}}/Z_{\mathrm {P}}\) in Wilson’s lattice QCD, exploiting suitable massive chiral Ward identities. We have shown that in lattice QCD with three flavours of Wilson-Clover quarks (with non-perturbative \(c_{\mathrm{sw}}\) [47]) and tree-level Symanzik-improved gauge action, the Ward identities are restored up to \(\mathrm{O}(a^2)\) at finite lattice spacing. In order to ensure a smooth dependence of the renormalisation constant ratio on the bare gauge coupling, we have enforced a constant physics condition by working with an approximately fixed physical volume of spatial extent \(L \approx 1.2\,\mathrm{fm}\) and \(T/L \approx 3/2\).

Our main results are the parameterisation of \(Z_{\mathrm {S}}/Z_{\mathrm {P}}\) in Eq. (34), valid for bare couplings \(1.55\lesssim g_0^2\lesssim 1.85\) (i.e., lattice spacings \(0.042\,\mathrm{fm}\lesssim a\lesssim 0.105\,\mathrm{fm}\)), as well as the values for \(Z_{\mathrm {S}}/Z_{\mathrm {P}}\), given in Table 6, at the bare couplings typically employed in the large-volume \(N_{\mathrm {f}}=2+1\) CLS ensembles [18,19,20,21]. On the technical level, we had to treat properly the topology freezing encountered in our simulations, principally at the finest lattice spacing, which may prevent a trustworthy estimation of the statistical error. The operator character of Ward identities ensures their validity in sectors of fixed topological charge. Thus we have projected the correlation functions entering the Ward identities onto the trivial topological sector throughout our analysis.

Several checks have been performed, in order to guarantee the stability of the analysis and a careful assessment of the statistical as well as the systematic errors. In particular, we have verified that results on \(Z_{\mathrm {S}}/Z_{\mathrm {P}}\) from the different classes of Ward identities at our disposal are perfectly consistent with each other as expected, i.e., up to ambiguities of \({\mathrm {O}}(a^2)\) or even higher. Among the various estimators for \([Z_{\mathrm {S}}/Z_{\mathrm {P}}](g_0^2)\), our preferred choice, advocated in Eq. (34), was guided by the structural simplicity of the underlying chiral Ward identity, its numerical precision, and its robustness against systematic effects.

Since the range of couplings covered in this work matches those of the large-volume gauge field configurations generated by CLS with the same lattice action, our result for \([Z_{\mathrm {S}}/Z_{\mathrm {P}}](g_0^2)\), combined with the scale dependent renormalisation factor \(Z_{\mathrm {P}}\) from [42], can be used in the computation of quark masses as outlined in Sect. 5. Work in this direction, extending the \((2+1)\)-flavour computations of light, strange and charm quark masses on the CLS ensembles reported in refs. [25, 41], is in progress.