Ward identity determination of $Z_\mathrm{S}/Z_\mathrm{P}$ for $N_\mathrm{f}=3$ lattice QCD in a Schr\"odinger functional setup

We derive chiral Ward identities for lattice QCD with Wilson quarks and $N_\mathrm{f} \geq 3$ flavours, on small lattices with Schr\"odinger functional boundary conditions and vanishingly small quark masses. These identities relate the axial variation of the non-singlet pseudoscalar density to the scalar one, thus enabling the non-perturbative determination of the scale-independent ratio $Z_\mathrm{S}/Z_\mathrm{P}$ of the renormalisation parameters of these operators. We obtain results for $N_\mathrm{f}=3$ QCD with tree-level Symanzik-improved gluons and Wilson-Clover quarks, for bare gauge couplings which cover the typical range of large-volume $N_\mathrm{f} = 2+1$ simulations with Wilson fermions at lattice spacings below $0.1\,$fm. The precision of our results varies from 0.3\% to 1\%, except for the coarsest lattice, where it is 2\%. We discuss how the $Z_\mathrm{S}/Z_\mathrm{P}$ ratio can be used in the non-perturbative calculations of $\mathrm{O}(a)$ improved renormalised quark masses.


Introduction
Lattice QCD with Wilson fermions is a long-established regularisation. The fermionic action satisfies most desirable properties, namely strict locality, lack of fermion doublers, and preservation of flavour symmetry in a straightforward way. Well-known shortcomings are the presence of discretisation effects linear in the lattice spacing and, most importantly, the loss of chiral symmetry. The first problem is solved by applying the Symanzikimprovement programme (see for instance Ref. [1] for a review and Ref. [2] for more details). Chiral symmetry is recovered in the continuum, at the cost of having to deal with complicated renormalisation properties for most quantities of interest (cf. Ref. [3] and references therein; for a review see also Ref. [4]). A frequently cited example of these complications is the power divergence m crit ∼ 1/a, which must be subtracted from bare quark masses before they are renormalised multiplicatively. Other examples are the normalisation parameter Z A of the axial current and the ratio Z S /Z P of the non-singlet scalar and pseudoscalar density renormalisation parameters. In a regularisation scheme which respects chiral symmetry, these quantities are strictly equal to unity at finite values of the UV cutoff. With Wilson fermions these quantities are scale-independent finite functions of the gauge coupling, which tend to unity as we approach the continuum limit. In principle they are determined by requiring that chiral Ward identities at non-vanishing lattice spacing tend to their formal counter-parts in the MS-TP-  arXiv:2005.01352v2 [hep-lat] 9 Aug 2021 continuum limit. The scope of this paper is to provide a method for the determination of Z S /Z P based on Ward identities on physically small lattices with Schrödinger functional boundary conditions and realising a line of constant physics (LCP) in parameter space. Results are obtained for N f = 3 dynamical quarks.
The general idea behind using chiral Ward identities in order to evaluate Z S /Z P for Wilson fermions appeared in Ref. [3] 1 . It has been put to practice with quenched, unimproved Wilson fermions in Ref. [5] and subsequently with tree-level Symanzik-improved ones in Ref. [6]. The chiral Ward identities in question were obtained for large-volume lattices with periodic boundary conditions and non-chiral quark masses. Ratios of Z S /Z P were calculated at fixed gauge coupling for several quark masses and extrapolated to the chiral limit. A second-generation of calculations was not based on Ward identities but obtained by computing Z S and Z P in the RI/MOM scheme [7]. Again these calculations are performed at finite quark masses, followed by chiral extrapolations. A well known problem in this approach is that the Z S /Z P ratio thus obtained differs from the Ward identity one by "Goldstone pole contaminations" at the IR end of a renormalisation window. This problem was first identified in Ref. [7], and subsequently discussed in Refs. [8][9][10][11] (and reviewed in Ref. [4]), while the discussion specific to the difference between Ward identity and RI/MOM determinations of the ratio Z S /Z P is found in Ref. [10]. Although the problem is greatly attenuated by the RI/SMOM variant of this method [12], the requirement of a reliable renormalisation window is inherent in these approaches.
In the present work we revisit the Ward identity method, with an important novelty: lattices with small physical volumes and Schrödinger functional boundary conditions are used, with quark flavours degenerate in mass and (almost) at the chiral limit. In doing so, we follow closely the method introduced in Ref. [13] (and originally applied in the quenched approximation in that work) for the non-perturbative determination of the scale independent normalisation parameter Z A of the axial vector current. Updates and optimisations of these computations can be found in refs. [14,15] for twoand three-flavour QCD, respectively. Ward identities are imposed at constant physics to ensure a removal of O(a) effects in on-shell quantities and, at the same time, smoothly vanishing O(a 2 ) effects as the bare coupling is varied. It must be stressed that the chiral Ward identities adopted in these works to determine Z A are valid for N f ≥ 2 quark flavours, while the ones we introduce in the present work for the determination of Z S /(Z P Z A ) are valid for N f ≥ 3.
We note in passing that, based on the chirally rotated Schrödinger functional construction of Ref. [16], a more recent method for the non-perturbative computation of Z S /Z P has been mentioned in Ref. [17].
This paper is organised as follows: in Section 2 (Subsection 2.1) we formally derive chiral Ward identities for continuum QCD, which relate correlation functions of non-singlet pseudoscalar and scalar composite operators (densities). The former are correlation functions with two operator insertions at two distinct space-time points (an axial current and a pseudoscalar density) in the presence of a generic external source operator. The latter involve a single insertion of the scalar operator. Subsequently (Subsection 2.2), we rewrite the same Ward identities in the lattice-regularised QCD with Wilson fermions. The external source consists of two standard Schrödinger functional boundary sources, each placed at a temporal boundary. The loss of chiral symmetry by Wilson fermions is taken into account by the renormalisation constants Z P and Z S of the pseudoscalar and scalar densities and the normalisation of the axial current, Z A . In the chiral limit, these Ward identities hold up to O(a 2 ) discretisation effects. We also discuss the corrections arising in practical simulations, which slightly deviate from the chiral limit; these are O(am, a 2 ). Finally, in Subsection 2.3 we re-express these Ward identities in terms of traces of valence quark propagators, which multiply factored-out traces of generators of the SU (N f ) flavour group. Section 3 takes an even closer look at these Ward identities. We distinguish several equivalence classes, each consisting of identities with different flavour structure, which reduce to the same relations between correlation functions, giving the same Z S /(Z P Z A ) result. Ward identities belonging to different equivalence classes provide Z S /(Z P Z A ) estimates which differ by O(am, a 2 ) effects. If we neglect these effects, we can combine identities from different equivalence classes, ending up with new relations between correlation functions (true up to O(am, a 2 ) errors). Thus we can explore to what extent different equivalence classes provide independent estimates of Z S /(Z P Z A ). Some of these estimates are expected to be noisier than others, as they are obtained using both quark-connected and quark-disconnected correlation functions.
In Section 4 we present our results for QCD with N f = 3 dynamical flavours, where the lattice gauge action is tree-level Symanzik-improved and the fermion action is non-perturbatively Wilson-Clover improved. Our simulations are performed with degenerate mass flavours lying close to the chiral limit. The non-perturbative determination of the ratio Z S /Z P is carried out along a line of constant physics in parameter space. In practice, this requirement is met by ensuring a volume of almost constant spatial extent L ∼ 1.2 fm in physical units, with Schrödinger functional boundary conditions. The ratio between temporal and spatial extent T /L is also kept fixed. This implies that any remaining intrinsic ambiguities in Z S /Z P of O(a 2 ) or higher (in the O(a) improved setup adopted here) disappear smoothly towards the continuum limit. The gauge couplings of our simulations span a range typical for the computations performed by the CLS (Coordinated Lattice Simulations) effort in QCD with N f = 2 + 1 flavours of non-perturbatively improved Wilson fermions [18][19][20][21]. Our Z S /(Z P Z A ) results are divided out by Z A , estimated in Ref. [22]. Our Z S /Z P estimates are subsequently extrapolated to the chiral limit at fixed g 2 0 . Results are obtained from several Ward identities; they differ by discretisation effects. Thus it is possible to create ratios of the different Z S /Z P determinations, and plot them against (powers of) the lattice spacing, confirming the expected scaling behaviour. The statistically and systematically most precise Z S /Z P determination is parameterised as a continuous function of g 2 0 , which is our final answer. This is compared to two other determinations: one is based on ratios of PCAC quark masses with different flavours, employing essentially the same small-volume Schrödinger functional setup [23]; the other is based on the relation between bare current quark masses and bare subtracted quark masses, computed on large volumes with open boundary conditions [20].
Finally, in Section 5 we discuss how Z S /Z P can be used in quark mass determinations along the lines proposed in Ref. [24], but performing the mass renormalisation in the Schrödinger functional scheme and the renormalisation group running non-perturbatively, between renormalisation scales µ had ∼ Λ QCD and µ PT ∼ M W . Such a calculation is subjected to different systematics than the standard ALPHA-CLS method, recently applied in Ref. [25].
Work in progress culminating to this paper had been reported in Refs. [26,27].
2 Chiral Ward identities for Z S /Z P In this Section we will derive chiral Ward identities which relate correlation functions of non-singlet scalar and pseudoscalar composite operators (densities). These enable us to compute non-perturbatively the ratio Z S /Z P , which determines the relative normalisation of these scalar and pseudoscalar densities when the regularisation (Wilson fermion action) breaks chiral symmetry.
First we will derive the pertinent chiral Ward identities in the formal continuum theory. Subsequently, we will show their lattice analogues with Schrödinger functional boundary conditions. The resulting Ward identity computation of Z S /Z P follows very closely that of Z A , described in refs. [13][14][15].
Our notation is pretty standard. Definitions of composite operators of dimension-3, axial transformations and Schrödinger functional (SF) boundary operators are collected in Appendix A. Conventions concerning the su(N f ) flavour algebra are to be found in Appendix B. The lattice spacing is denoted by a, the (squared) gauge coupling by g 2 0 , and the inverse lattice coupling by β ≡ 6/g 2 0 . Bare current (PCAC) and subtracted masses are defined in Appendix C.

Formal chiral Ward identities in the continuum
Under the small axial variations (A.6) of the fermion fields the formal, continuum QCD action in Euclidean space-time transforms as follows: The fermion mass matrix is denoted by M . We work in the flavour symmetric (isospin) limit, so all quark masses m are degenerate. In the last expression we have integrated by parts the term with the axial current. Chiral Ward identities are obtained by considering that under the change of field variables defined in Eqs. (A.5), the expectation value of any composite operator O (and products of them) is invariant. In the limit of small axial variations this leads to: We now take the axial variations to be non zero only in a space-time region R with a smooth boundary ∂R (i.e., for x ∈ R, a (x) = 0; otherwise a (x) = 0). The above expression reduces to We consider a product of composite operators O = P b (y)O ext , where y ∈ R and O ext is defined outside the region R. This implies that The pseudoscalar density P b (x) transforms as follows: At this stage we impose that c (x) = δ ac ; i.e., it is a constant phase in a fixed direction a in flavour space, so that Ward identities become expressions reflecting global chiral symmetry. Moreover, in order to sidestep a number of complications 2 , we chose a = b, so that the last term on the r.h.s. of Eq. (4) drops out 3 . Putting everything together, we obtain We note in passing that the first term is a surface term: As done in Ref. [13] for Z A , we chose the region R to be the space-time volume between the hyper-planes at y 0 − t and y 0 + t 4 . Boundary conditions in space are periodic, implying R dx 0 d 3 x∂ k A k · · · = 0. The Ward identity becomes It is convenient to introduce a spatial integration over y: The second line of the l.h.s. contains a contact term, arising when r ≡ |x − y| → 0. The operator product is expressed in terms of an OPE (recall that a = b) 2 With Wilson fermions, the singlet scalar operatorψ(x)ψ(x) mixes with the identity operator, introducing the complication of power divergences. Moreover, Wick contractions of the fermion fields of this operator generate quark-disconnected diagrams. 3 Here we are working with the algebra su(N f ) for N f ≥ 3; for N f = 2 we have that d abe = 0 and the r.h.s. of Eq. (4) is trivial. 4 This choice of hyperplanes is made for simplicity. A more general choice, y 0 − t − and y 0 + t + , with t − = t + and t − ,t + > 0, is also acceptable.
where [D] is the operator dimension and the Wilson coefficients C k contain logarithms. The most divergent term in the OPE, taking into account the various symmetry properties of the operator product, is proportional to S e (x). The contribution to the space-time volume integral 2m R · · · of a small four-sphere of centre x and radius a (or a small four-cube of size a) is then ∼ m a 0 dr r 3 r D−6 · · · ∼ m a D−2 · · · and thus the leading term in the OPE contributes O(am). In the lattice regularisation this implies that the contact term contributes an O(am) discretisation effect to the Ward identity, even in a Symanzik-improved setup.

Lattice Ward identities with Schrödinger functional boundary conditions
We now adapt the previous formal manipulations to the lattice regularisation with Schrödinger functional boundary conditions. The external source for the Ward identity correlation functions is chosen to be a tensor in flavour space O ad ext : with O a and O d defined in Eqs. (A.7). With this source and in lattice notation the Ward identity (8) becomes (with b = c): In this expression, repeated flavour indices e are summed, as usual. The weight factor is w(x 0 ) = 1/2 for x 0 = y 0 ± t and w(x 0 ) = 1 otherwise. It is introduced in order to implement the trapezoidal rule for discretising integrals. The mass m is the current quark mass defined in Eq. (C.5); recall that we work with degenerate masses. Assuming that we work in the chiral limit (or with nearly vanishing quark masses, so that O(am) effects may be safely neglected), the above Ward identity is valid up to O(a 2 ) dicretisation errors in lattice QCD with Wilson quarks. Chiral symmetry breaking implies the (re)normalisation and improvement properties summarised in Appendix C. The Symanzik b-coefficients appearing in Eqs. (C.2)-(C.4) multiply the subtracted quark mass m q or the quark mass matrix M q . When working in or close to the chiral limit, as is the case in our simulations, we may safely drop these terms. Putting everything together we obtain Ward identity (11). The renormalisation factors of the external sources O a and O d are not taken into consideration, as they cancel out on both sides of the identity. Note that the term proportional to the current quark mass m may also be dropped in the chiral limit. In practice, since we are always working with masses that are not strictly zero, it turns out that it is advantageous to keep this term; see Ref.
Eq. (11) can be solved for Z S /(Z P Z A ). With Z A known either from other PCAC Ward identities [13][14][15] or from the chirally rotated Schrödinger functional formalism [22], we can thus obtain Z S /Z P .

Lattice Ward identities, Wick contractions, and flavour factors
Ward identity (11) relates expectation values of four composite operators on the l.h.s. to those of three composite operators on the r.h.s.; with a slight abuse of terminology, we call these four-and three-point correlation functions, respectively. We express these correlation functions, with Schrödinger functional boundary fields, in terms of traces of quark propagators. In standard AL-PHA notation [28], [ψ(y)ψ(x)] F denotes a quark propagator in a fixed background gauge field configuration, where x and y are space-time points in the bulk of the lattice. Propagators from the x 0 = 0 boundary to the bulk are [ζ(v)ψ(y)] F (with v a point at the x 0 = 0 boundary), while those from the x 0 = T boundary to the bulk are [ζ (v )ψ(y)] F (with v a point at x 0 = T ). Boundaryto-boundary propagators are [ζ (v )ζ(u)] F . For proper definitions see Ref. [28]. Note that, since we are working in the su(N f )-symmetric limit, all masses are degenerate and quark propagators of different flavours are indistinguishable 5 .
Performing the Wick contractions, we write the threepoint correlation function of Eq. (11) as where T aed ≡ Tr(T a T e T d ) are traces of three flavour su(N f ) generators and F S;1 (y 0 ), F S;2 (y 0 ) are expectation values of traces of quark propagators with a scalar insertion. The exact expressions can be found in Table 1. Note that traces Tr act in flavour space, traces tr act in spin-colour space, and · · · denote averages over gauge field configurations. In Fig. 1 we show the quarkline diagrams corresponding to the spin-colour traces in the above equation. Any Wick contraction between fermion fields at the same point in the bulk [ψ(y),ψ(y)] F , or between boundary fields at the same time-slice (e.g. [ζ(v)ζ(u)] F ) gives rise to a quark-disconnected diagram 6 , multiplied by the trace of an su(N f ) generator. As this trace is zero, such diagrams do not contribute to the three-point correlation function. An example of such a diagram is shown in Fig. 1.
In Appendix D we combine the usual γ 5 -Hermiticity property of quark propagators, charge conjugation invariance of the lattice theory, and the trace properties of Eq. (B.4), to cast the r.h.s. of Eq. (12) into a single real term, and obtain for the r.h.s. of the Ward identity (11): Next we concentrate on the l.h.s. of Eq. (11). For simplicity we drop, for the moment, the term proportional to the quark mass. The l.h.s. consists of boundary-toboundary correlation functions with two insertions of dimension-3 operators in the bulk, which can be cast in the general form Upon performing the Wick contractions, each correlation function is expressed as the sum of 9 terms. They are products of traces of flavour matrices (denoted as T abcd k ) and traces of loops of quark propagators averaged over gauge field configurations (denoted as F AP;k (x 0 , y 0 )). The former traces are defined as: T abcd 6 It is common practice to refer to these diagrams simply as disconnected. Since from a strict field-theoretic point of view they are connected (with multitudes of gluon lines, some of which contain fermion loops), the term quark-disconnected is more appropriate (valence-quark-disconnected would be even more accurate, but far too long). In the literature, quark-connected and quark-disconnected are sometimes referred to as one-and two-boundary diagrams. while the latter ones are also given in Table 1. The spincolour trace diagrams are shown in Fig. 2. We see that there are six quark-connceted diagrams, and three quarkdisconnected ones. The condition b = c implies that T 9 F AP;9 (x 0 , y 0 ) = 0, due to the vanishing of Tr( Once more we combine γ 5 -Hermiticity, charge conjugation invariance, and Eq. (B.5), to obtain for the l.h.s. of the Ward identity (11): . (21) Note that correlation functions F AP;k are real for k = 1, . . . , 9. See Appendix D for more details. We will use a somewhat more compact notation, defining Collecting Eqs. (13), (21), and (22), we write the Ward identity (11) in the chiral limit as: In order to keep the equation simple, we have not shown the mass-dependent terms with two pseudoscalar density insertions, appearing in Eq. (11). These terms are included in the numerical analysis, which is carried out close to, but not strictly at the chiral limit.  The reader should have no difficulty convincing himself that they are exactly analogous to F AP;k (y 0 + t, y 0 ) and F AP;k (y 0 − t, y 0 ) appearing above. Their net effect is to add extra mass-dependent contributions to the ∆ k (y 0 , t) functions. From now on, the ∆ k (y 0 , t) functions are meant to include these contributions, proportional to the quark mass. Consequently, the uncertainty on the r.h.s. of Eq. (23) becomes O(am, a 2 ).
It is interesting to compare the Ward identities we have derived here to the one introduced in Ref. [13] for the determination of Z A . The former are valid for N f ≥ 3, while the latter for N f ≥ 2. The Ward identity of Ref. [13] involves correlation functions with two axial current insertions in the bulk. In our case we have more complicated contributions, consisting of time-differences of correlation functions with one axial current and one pseudoscalar density insertion.

Determination of Z S /(Z P Z A ) from Ward identities
Ward identity (23) is a master equation, from which a plethora of relations arise for specific choices of flavour indices a, b, c, d. In what follows, each of them will be distinguished by the label WI(abcd). Not all of them are suitable for the determination of Z S /Z P . The following constraints need to be imposed: In spite of these constraints, a lot of freedom remains in the choice of flavour indices, resulting in many Ward identities. They are relations between the correlation functions of the master equation, which can be solved for Z S /(Z P Z A ). These Ward identities can be grouped into different equivalence classes. Each class consists of several identities WI(abcd) with different flavour indices a, . . . , d, but identical flavour factors Re (T k ) (k = 1, 3,5,7,8), and thus the same Eq. (23). Therefore, the same Z S /(Z P Z A ) estimate is obtained from all Ward identities of the same equivalence class. Estimates of Z S /(Z P Z A ) from Ward identities of different classes differ by discretisation effects.
The combinations of conditions (i)-(iii) simmer down to the choice of flavour indices (a, b, c, d), with b = c, such that d bce d ade = 0. We systematically investigated the choices of flavour indices which fulfill these conditions with a computer algebra program and grouped them into the equivalence classes which are tabulated in Table 2 Table 3. We will see below how this leads to useful relations between certain ∆ k functions. -The quark disconnected traces ∆ 7 and ∆ 8 do not contribute to the equivalence classes of the top half of Table 2 (separated by a triple line from the bottom half).
In Table 3 we collect the flavour factors Re (T k ) (k = 1, 3, 5), T 7 , and T 8 for each class. Depending on the choice of flavour indices a,b,c,d, some of these flavour factors vanish. This simplifies the resulting Ward identity. Also here the top part of the Table (separated by a double line from the bottom half) lists the Ward identities without ∆ 7 -and ∆ 8 -type contributions.
There are two possible ways of using the 11 Ward identities of Table 3. A first approach would be to determine Z S /(Z P Z A ) from each of the 11 variants of Eq. (23). In principle these determinations differ by O(am, a 2 ) effects and that should provide a handle for a good control of the related systematics. However, in practice the different Z S /(Z P Z A ) results are all obtained from the same configuration ensembles and are thus strongly correlated. Moreover, paired Ward identities (in the sense discussed above; cf. Table 2) have very similar relations between their ∆ k -terms and this also leads to very similar Z-ratios.
A second approach would be to combine these Ward identities in order to first obtain relations between the various ∆ k -terms. These would be true up to O(am, a 2 ) at fixed gauge coupling, and once established, would simplify the equation(s) relating Z S /(Z P Z A ) to the ∆ k 's. In this spirit we proceed as follows: (i) Starting from Ward identities without quark disconnected contributions (i.e., with Re (T 7 ) = Re (T 8 ) = 0; top part of Table 3), we combine the pair WI(1245) and WI(1425) to obtain: Note that by combining the pair WI(1486) and WI (1846) we also obtain the above expressions, so this pair does not provide extra information.
(ii) WI(1468), which has no partner, is written, in terms of the ∆'s defined in Eq. (22), as: This on its own determines the ratio Z S /(Z P Z A ). Note that combined with Eq. (24), it gives us Eq. (25). Our conclusion is that all Ward identities with Re (T 7 ) = Re (T 8 ) = 0 reduce to the equality ∆ 1 = ∆ 5 (i.e., diagrams F AP;1 and F AP;5 of Fig. 2 are related) and a single Ward identity, from which Z S /(Z P Z A ) may be computed.
(iii) Passing to Ward identities with quark-disconnected contributions (bottom part of Table 3), we combine the pair WI(1188) and WI(1818) to obtain: where Eq. (24) has also been used to arrive at Eq. (28).
(v) If we now combine Eqs. (28) and (30), we obtain again Eq. (25) and the new relation The bottom line is that, up to O(am, a 2 ) discretisation effects, the 11 Ward identities corresponding to the entries of Table 3 are not all independent. They can be combined to give three relations between the functions ∆ k , which depend on traces of valence quark propagators, without references to flavour traces; these are Eqs. (24), (27), and (31) 7 . The extent to which these relations are fulfilled at non-zero lattice spacing is an indicator of the size of discretisation effects. Moreover, if we take them at face value, the remaining Ward identities (25), (26), (28), and (30) reduce to a single expression. Any of them can be used to provide estimates of the ratio Z S /(Z P Z A ). We expect Eqs. (28), and (30) to be noisier, as they involve quark-disconnected diagrams. Eq. (25) seems promising, as it only involves ∆ 1 and ∆ 3 , but it cannot be excluded a priori that Eq. (28) turns out to be better behaved. This can only be decided by numerical investigation. Of course, these considerations do not exhaust all possibilities. Any linear combination of the Ward identities considered above, possibly combined with the relations (24), (27), (31), can be used for the computation of Z S /(Z P Z A ). For example, the linear combination L 1 ≡ [WI(1245)−WI(1425)], combined with Eq. (24) gives: The determination of Z S /(Z P Z A ) from the above depends only on quark-connected diagrams. Similarly, the linear combination L 2 ≡ [12WI(1818)−8WI(1414)] gives: which yields a Z S /(Z P Z A ) estimate from quark-connected and quark-disconnected diagrams. The last two expressions will be used in the following for numerical crosschecks. 7 As an aside we note that Eqs. (24) and (27) relate correlation functions of similar topology (quark-connected or quarkdisconnected ones). On the contrary, Eq. (31) is more intriguing, as it relates quark-connected to quark-disconnected diagrams.

Numerical setup and results
We investigate the proposed Ward identities on lattices with tree-level Symanzik improved gluons and Wilson-Clover quarks. The action coincides with the one used by CLS [18,20,21]. We employ Schrödinger functional boundary conditions in time, which enable us to simulate at quark masses close to the chiral point and control systematic effects related to the massless renormalisation framework. The details of this aspect are discussed in Subsection 4.1. Similar to the procedure in [15], we construct boundary-to-boundary three-and four-point functions with pseudoscalar Schrödinger functional wall sources and use wavefunctions at the boundaries as explained in [29]. The statistical error analysis is performed using a python implementation of the Γ -method [30] (exploiting information from the autocorrelation function) with automatic differentiation [31]. The gauge ensembles used in this study are detailed in Table 4. They coincide with the ones used in [23] but for the ensemble C1k1. These are essentially the ensembles used in [15,29] plus the ensembles A1k3, A1k4, B1k4, C1k1, D1k2 and D1k4, which were added to improve the chiral fits. For the two ensembles E1k1 and E1k2 the number of molecular dynamics units was increased by factor of more than 4. The ensembles with volume L 3 × T described above are designed to lie on a line of constant physics (LCP), where the spatial extent of L ≈ 1.2 fm and T /L ≈ 3/2 are almost constant. The Ward identity conditions which fix the ratio Z S /(Z P Z A ) are imposed at constant physics, i.e., we require that all length scales in the correlation functions, which define a given condition formulated through one of the foregoing Ward identities, are kept fixed in physical units. Once this requirement is satisfied, only the lattice spacing a changes as g 0 is varied. Consequently, renormalisation constants (as well as their ratios) extracted from different constant physics conditions are expected to rapidly approach an almost unique function of g 0 as g 0 → 0. For a more general discussion of the constant physics idea in a similar context see, e.g., Ref. [32]. The initial tuning of this LCP was done based on the (universal) 2-loop beta-function as explained in Ref. [29]. Thus the volume of the lattices varies by ≈ 10% over the range of couplings considered. However, using the results of Ref. [19], we verified that this deviation is proportional to the lattice spacing a and thus contributes to our quantity of interest only as a higher-order ambiguity 8 .  : Summary of simulation parameters: the first column (ID) labels our gauge configuration ensembles, the second column lists the lattice sizes L 3 × T /a 4 , the third one the inverse gauge couplings β, the fourth the Wilson hopping parameters κ, the fifth shows the total number of molecular dynamics units MDU, the sixth the autocorrelation time of the slowest mode τexp, and the last one the corresponding lattice spacing a, estimated from Ref. [19].
The simulations in this work suffer from critical slowing down of the topological charge for smaller lattice spacings. This phenomenon, often dubbed "topology freezing", could give unreliable results due to an insufficient sampling of topological sectors. We circumvent this problem by reweighting all data to the trivial topological sector Q = 0 at the cost of decreasing the effective number of configurations; see [29,34] for a discussion. Furthermore we increase the statistical uncertainties by attaching a tail to the integrated autocorrelation functions as proposed in [35]. As measure for τ exp , the autocorrelation time of the slowest mode in the simulation, we use the integrated autocorrelation time of the squared topological charge Q 2 extracted from the longest Monte Carlo chain for each value of β. The τ exp -values for the individual ensembles can be found in Table 4.
In order to solve the Ward identity for Z S /Z P we need non-perturbative knowledge of the non-singlet axial current renormalisation constant Z A and the O(a) improvement coefficient c A . The constant Z A was calculated on a subset of the gauge configurations in this work, Ref.
[15], as well as in the chirally rotated Schrödinger functional, Ref. [22], which is a completely different determination. We prefer the results from the latter because of their smaller statistical uncertainties. The errors of Z A are accounted for in quadrature when solving for Z S /Z P in our Ward identity expressions. For c A we use the results of [29], without error, following standard practice.
In principle the ratio we would like to determine, as well as all correlation functions involved, depend on the O(a) improved couplingg 2 0 = g 2 0 [1+ab g tr M q /N f ], where the coefficient b g is only known at 1-loop perturbation theory [2]. This issue is of no relevance here, as all normalisation conditions are imposed at zero quark mass. However, this should be kept in mind when using results obtained here in a different setting with non-vanishing sea quark masses.
In order to study the scaling behaviour of some of our results, we need the lattice spacings in physical units at the bare couplings used in this work. In Ref. [19], such values are provided for couplings close to those in Table 4; these enable us to extract the lattice spacings at our gauge couplings using a polynomial interpolation.
As additional cross checks we investigate the nonperturbative validity of the identities (24), (27) and (31). The results can be found in Appendix E.

Chiral extrapolation
From the plethora of possible renormalisation conditions listed in Section 3, we single out a class labeled WI(1468) to which only quark connected diagrams contribute and for which the statistical precision is best. We detail the analysis for this specific choice, but the same steps also apply to any other identity discussed in the following. In order to obtain Z S /Z P at vanishing quark mass, we extra-or interpolate the data at fixed bare coupling to the chiral point. For this procedure we employ the O(a) improved PCAC mass, which we average over the central third of the temporal extent of the lattice, similarly to what was done in Ref. [  insertion times in the master equation (23), we chose y 0 = T /2 and t = T /6 rounded up to the closest integer 9 . The idea behind this choice is to place the operators as far away from the temporal boundaries as possible, so as to suppress boundary induced cutoff effects, while keeping the individual operators apart from each other, thus avoiding contact terms.
In Fig. 3 we show the chiral extrapolation of our preferred determination WI(1468), at β = 3.676, where quark masses cover a large range in lattice units. We compare results obtained from the Ward identity with and without the mass term (i.e., the term with two pseudoscalar insertions in Eq. (11)). We see that in the "massive" case our results display a linear behaviour in the whole mass range. In addition statistical uncertainties are smaller and the data show an almost flat dependence on am, resulting to a more reliable chiral extrapolation. Therefore, we obtain Z S /Z P in the chiral limit by fitting linearly the results of the "massive" case. For this fit we employ orthogonal distance regression [36] which takes into account not only errors in the dependent, but also in the independent variable. The error obtained from this procedure for the chirally extrapolated Z S /Z P is in general larger compared to the one obtained from a standard least squares fit. Results for the individual ensembles as well as the chiral extrapolations are summarised in Table 5, which will be discussed in Subsection 4.2. 9 As discussed in Ref.
[15], the temporal extent of our lattices is odd, so there is no central time-slice.

Scaling
In Table 3 we have listed 11 classes of distinct Ward identities; each of them is a different relation between correlation function differences ∆ k (k = 1, 3, 5, 7, 8) and F S;1 , from which Z S /Z P may be obtained. In Fig. 4 we show these determinations in the chiral limit as functions of the gauge coupling g 2 0 . It is evident, as argued in Section 3, that there are very strong correlations between results obtained on the same configuration ensembles from "similar" Ward identity classes, as grouped in Table 2.
We are thus led to select, from the plethora of Ward identities, four representative determinations of Z S /Z P . Two of these involve only quark connected diagrams. These are WI(1245) and the linear combination L 1 , leading to Eq. (32). The other two determinations involve both quark connected and disconnected diagrams and are therefore numerically more challenging. Here we chose WI(4488), and the linear combination L 2 , leading to Eq. (33). The results for each ensemble and in the chiral limit are shown in Table 5.
To evaluate the relative cutoff effects among our different results, we form ratios of Z S /Z P , obtained from each of the four determinations described above, to Z S /Z P from our preferred identity WI(1468). We investigate the lattice spacing dependence of each of these four ratios which, in our Symanzik-improved setup, consists of powers of a 2 and higher. The ratios are known to tend to unity in the continuum limit. We therefore fit them with polynomials in the lattice spacing, constrained to be 1 at the origin. Results are displayed in Fig. 5. The top panel of the figure displays results from the first two determinations, without quark disconnected contributions. The deviations from 1 in the ratio WI(1245)/WI(1468) are very mild and can be described by a single term quadratic in the lattice spacing with χ 2 /d.o.f = 0.474. For the ratio L 1 /WI(1468) the deviation from 1 as well as the statistical uncertainties are larger. A glance at Fig. 5 should convince the reader that the data cannot be described by a single-parameter fit with a quadratic term. Fitting with 1 + c 2 a 2 + c 3 a 3 results to c 2 = −9.    (32) and (33). In all Ward identities the mass terms with two pseudoscalar insertions in the bulk have been included; cf. eq. (11). The errors quoted for the individual ensembles are statistical; the uncertainty on the values at the chiral point stem from the orthogonal distance regression procedure of Ref. [36]. Our preferred Z S /Z P estimates are obtained from the WI(1468) results (in boldface).
fit shown in the Figure.  We did not find any evidence for O(a) cutoff effects; trying to fit an additional term proportional to a gives coefficients which are zero within errors.

Interpolation formula
To facilitate the use of our Z S /Z P results in large volume simulations, we provide an interpolation formula for lattice spacings 0.04 fm a 0.1 fm. Having tried several fit ansätze, we opt for a Padé interpolation constrained by the 1-loop value [37] of the form with the covariance matrix cov(Z As the functional form in the non-perturbative coupling region is in principle unknown, we investigated the significance of systematic effects by also experimenting with alternative forms of interpolating functions (such as higher-order Padés, exponentials and polynomials), constrained to monotonically approach the 1-loop perturbation theory result. However, among those describing our results reliably (as signaled by an acceptable χ 2 /d.o.f.) practically coincide with the interpolation (34) in the fitted range of couplings, so that the associated systematic errors are negligible compared to the statistical ones. Therefore, we only account for systematic uncertainties when extrapolating with Eq. (34) to values slightly outside the fitted range by adding a systematic error of 50% of the size of the statistical one in quadrature. This prescription is applied at β = 3.85, which corresponds to the finest lattice spacing simulated by the CLS effort.
The WI(1468) results with the interpolation are shown in Fig. 6, where they are also compared to the prediction of 1-loop perturbation theory. The vertical   Table 3. Open symbols are used for the Ward identity classes with connected-quark diagrams only; closed symbols denote Ward identity classes with both connected-and disconnected-quark diagrams. Closely related Ward identities (which are separated by a single horizontal line in Table 2) are shown with the same symbol. Data from WI(1144) are shown at their exact abscissa position, while the others have been slightly displaced in the g 2 0 -direction, in order to improve visibility.   [18,20,21]. The error of the WI(1468) results is the statistical uncertainty propagated from the interpolation formula (34) except for β = 3.85 where we added a systematic uncertainty, 50% of the size of the statistical one, in quadrature. For the results of the two LCP columns we combine the errors of Z (from Ref. [23]) and Z A (from Ref. [22]) in quadrature.
dashed lines mark the bare couplings used in CLS simulations, to which we want to interpolate our results. Results for Z S /Z P at the g 2 0 -values used in N f = 2 + 1 CLS simulations are given in Table 6.

Comparison with previous works
We are not aware of any direct determinations of Z S /Z P in our specific setup, but we can compare our findings, using existing results for the quark mass renormalisation constant Z ≡ Z P /(Z S Z A ). The idea is to compute Z S /Z P = (ZZ A ) −1 , with Z from either Ref. [20] or Ref. [23], and Z A from Ref. [22]. In Ref. [20], Z has been computed on large-volume CLS ensembles, from the relation between PCAC quark masses m ij and subtracted quark masses m q,ij (see Section 5 and Appendix C for these mass definitions). The Z-results in Ref. [23] were obtained on almost the same gauge ensembles used in this work 10 at small volumes and nearly-chiral sea quark masses. The method of Ref. [23] is based on suitable combinations of renormalised quark masses, defined both through the PCAC relation and the subtracted bare mass, evaluated in the O(a) improved theory with non-degenerate valence quarks, including all necessary counterterms. Results are quoted for two different lines of constant physics labeled LCP-0 and LCP-1, which differ by the values at which the quark masses in the valence sector are kept fixed as g 0 is varied. We compute the ratio of 1/(ZZ A ) from Refs. [20] and [23] to Z S /Z P from our preferred WI(1468). We investigate the lattice spacing dependence of this ratio, which consists of powers of a 2 and higher, and tends to unity in the continuum limit. The results are plotted in Fig. 7. Polynomial fits are performed on the LCP-0 and LCP-1 ratios, excluding the data of the coarsest ensembles, which display poor scaling behaviour and large errors. A two-parameter fit of the form 1 + c 2 a 2 + c 3 a 3 results to χ 2 /d.o.f = 0.281, c 2 = −2.5(3.7) and c 3 = 242(57) for LCP-0, and χ 2 /d.o.f = 0.166, c 2 = 1.5(2.8) and c 3 = 148(45) for LCP-1, in both cases c 2 is consistent with zero. We thus prefer to plot the results as functions of a 3 in Fig. 7, where we also show a one-parameter fit of the form 1 + c 3 a 3 ; for this ansatz we obtain χ 2 /d.o.f = 0.300, c 3 = 206(14) for LCP-0 and χ 2 /d.o.f = 0.170, c 3 = 169(12) for LCP-1 11 . We interpret this as confirmation that the two methods are compatible w.r.t. the expected 11 Since we neglect correlations between our results and those of Ref. [23], the error in their ratio is probably overestimated. This explains the small values of χ 2 /d.o.f. lattice spacing ambiguities and that the effects of O(a 2 ) are sub-dominant compared to the next higher order.
Let us briefly comment on the possible benefits of the respective results on Z S /Z P collected in Table 6, originating from the different approaches underlying Ref. [23] and this work. First, one observes comparable uncertainties between the two. While the method of that reference involves combinations of simpler and thus typically less noisy correlation functions (i.e., with only one operator insertion in the bulk) as well as an accurate computation of the valence quark mass dependence prior to the chiral extrapolations, our estimates on Z S /Z P from the more direct Ward identity approach followed here exhibit an overall flatter and, at larger couplings, less steep g 2 0 -dependence. This points to generically smaller cutoff effects so that continuum extrapolations of quantities where it enters may be expected to become better controlled and more precise in the long run, because they are also less affected by unpleasantly significant admixtures of higher-order cutoff effects.
The results for Z presented in [20], stemming from large-volume calculations on a subset of the CLS ensembles, are only available at two values of the bare coupling, which do not coincide with the couplings investigated in this work. In order to compare with our results we make use of the interpolation formula Eq. (34). Although the estimates for Z from Ref. [20] are only available at two values of the bare coupling and we hence do not attempt a fit in this case, we notice that they are compatible with LCP-0.
In summary, comparison with earlier works is consistent with the expectation that all ambiguities between different determinations of Z S /Z P show a scaling according to O(a 2 ) or higher. However, the size of these ambiguities is quite large and may still have a relevant impact on applications as described in the next Section.

Application: quark mass computations with Wilson fermions
We will now discuss a method of computing quark masses with Wilson fermions which uses the ratio Z S /Z P .
First we review the well-established "PCAC quark mass method". It is the conventional ALPHA Collaboration approach, which relies on the PCAC definition of quark masses m ij of Eq. (C.7). These bare current masses are computed on large physical volumes 12 and for a range of couplings typical of hadronic, low-energy 12 The ALPHA Collaboration has performed these calculations for quenched QCD with Schrödinger functional boundary conditions; see Ref. [38]. The CLS effort determined quark masses for N f = 2 QCD with periodic boundary conditions [39,40] and for N f = 2 + 1 QCD with open boundary conditions [25,41]. scales µ had ∼ Λ QCD . Although we keep our notation as general as possible, for concreteness we consider a theory with N f = 2 + 1 dynamical fermions; i.e. the two lightest flavours are degenerate in mass while the third flavour is heavier (m q,1 = m q,2 < m q,3 ).
We see from Eq. (C.8) that the renormalised light mass is given by The ratio of the heavy to light renormalised masses is also derived from the above expression: =2 Knowing the renormalised light mass from Eq. (35), and the ratio of the heavy and light renormalised masses from Eq. (36), the up/down and strange masses are obtained [19,25]. So in principle this method requires: 1. The axial current normalisation Z A (g 2 0 ) and the renormalisation constant Z P (g 2 0 , µ had ) of the nonsinglet pseudoscalar density; the latter carries the renormalisation scheme and scale dependence of the continuum quark mass. In our N f = 3 setup, these may be found in Refs. [22] and [42], respectively. 2. The Symanzik-improvement coefficients (b A − b P ) and (b A −b P ). Non-perturbative (b A − b P )-estimates in our setup may be found in Ref. [23]. Note that in perturbation theory (b A −b P ) ∼ O(g 4 0 ), so that the term proportional to this coefficient is habitually dropped. 3. It is also noteworthy that Eq. (36) does not require knowledge of κ crit , which is however needed in m q, 12 and Tr(M q ) in Eq. (35). We shall return to this point in Subsection 5.1.
Based on the results of Ref. [43] for Symanzik-improved quark masses with Wilson fermions, an alternative approach, known as the "ratio-difference method", has been proposed in Ref. [24]. The renormalised quark mass difference is given by    [20,23] to Z S /Z P from WI(1468). malisation scheme and scale dependence of the continuum quark mass.

The Symanzik-improvement coefficients (b
b m andb m . Non-perturbative estimates of the b mcoefficient in this setup may be found in Ref. [23]. 13 Sinceb m ∼ O(g 4 0 ), the term proportional to Tr(M q ) is habitually dropped. 3. The critical hopping parameter κ crit is needed in m q, 13 and Tr(M q ) in Eq. (37). We shall return to this point in Subsection 5.1.
We have outlined the basic idea behind the PCAC quark mass method and the ratio-difference method, listing the renormalisation parameters and improvement coefficients required by each one. The most crucial difference is that in the PCAC quark mass method all bare masses are given in terms of the current masses m 12 and m 13 , which are renormalised by Z −1 P Z A , while in the ratio-difference method the bare mass difference is the exactly known [m q,3 − m q,1 ], which is renormalised by Z −1 S . It is not possible to determine Z S with a Schrödinger functional renormalisation condition analogous to that introduced in Ref. [44] for Z P . The latter involves correlation functions with a pseudoscalar source at the boundary (see Eq. (A.7)) and the pseudoscalar 13 In perturbation theory 2bm = −1 + O(g 2 0 ) and the nonperturbative estimates of Ref. [23] are also numerically sizeable. Thus this Symanzik counterterm is expected to remove large O(a) effects, especially in future computations of heavy flavour quark masses (charm etc.). scalar operator at the bulk. If we place a scalar operator at the bulk, keeping the pseudoscalar boundary source, the correlation function vanishes due to parity. Nor is it possible to have a scalar source at the boundary and the scalar density at the bulk, since this would result in the product P + P − of the projection operators of the boundary quarks and the vanishing of the correlation function. An option would be to impose a renormalisation condition on the correlation function O a S b (x) O c , with the two pseudoscalar boundary sources O a and O c and the scalar operator S b in the bulk. This would be an acceptable intermediate scheme of the Schrödinger functional variety, but different than the one introduced in Ref. [44] for Z P . Thus, the renormalised quark masses m 1R , m 3R obtained by combining Eqs. (35) and (36) (PCAC quark mass method with Z P ) would be in a different scheme than those obtained from Eqs. (37) and (36) (difference-ratio method with Z S ). Only results obtained for the scheme-independent renormalisation group invariant (RGI) masses from the two methods would be comparable. This comparison would be very useful but cumbersome, as it requires the computation from scratch of the step scaling function in the new intermediate scheme, from ratios of Z S 's at fixed renormalised coupling and two different renormalisation scales, and for a range of couplings.
Given the above considerations, we are led to define the scalar operator renormalisation parameter through: This is our definition of the Schrödinger functional renormalisation scheme for the scalar non-singlet operator. The Z S /Z P -ratio on the r.h.s. is scale independent, being determined from Ward identities. Clearly, scalar and pseudoscalar densities have the same renormalisation group running properties (i.e., the same anomalous dimensions, the same step scaling functions in the continuum, etc.). So knowledge of the Z S /Z P ratio enables us to obtain the light and heavy quark masses in the usual Schrödinger functional scheme [44], but with a different method based on mass differences (and Z S ) combined with scale-independent PCAC mass ratios. The novel renormalisation and improvement patterns provide an important handle for the control and reduction of systematic effects related to the non-perturbative determination of renormalisation parameters and discretisation errors 14 . What is common in both methods is the renormalisation group running that takes us non-perturbatively from renormalised masses at low energy scales µ had to masses at large, perturbative scales µ PT ∼ M W , as described in Ref. [44]. For recent results on the running of quark masses in N f = 3 QCD see Ref. [42].

Subtracted masses, PCAC masses, and redefined Symanzik counterterms
We will close this section by reviewing how, in both methods, we can circumvent the need to use κ crit in the Symanzik counterterms of Eqs. (35) and (37), which feature subtracted masses am q,ij and Tr[aM q ]. This can be avoided by substituting these subtracted masses with current quark masses. Their relation is given by [43], where Z(g 2 0 ) ≡ Z P /(Z S Z A ) and r m ≡ Z S /Z S 0 are finite normalisations (Z S 0 is the renormalisation parameter of the singlet scalar density). In the above we neglect O(a) terms, as they only contribute to O(a 2 ) in the b-counterterms of Eqs. (35) and (37). Substituting am q,ij → am ij in these expressions, we obtain respec- 14 This could be crucial in computations of heavier quark masses (charm etc.), where the discretisation errors become dominant. tively and 1 + 2b m am 13 Thus, am q,ij and κ crit in Eqs. (35) and (37)  . A first non-perturbative study of the coefficientsb A ,b P , andb m produced noisy results with 100% errors [46]. Since in perturbation theory (b A −b P ),b m ∼ O(g 4 0 ) [43], the terms proportional to M sum are habitually dropped.
For completeness we also discuss a slightly different way to write the b m -counterterm of the renormalised quark mass difference of Eq. (37), in close analogy to what is done in Ref. [24]. The term in question is written as follows: We arrive at the second expression using Eq. (39) and introducing the PCAC mass m 33 , which consists of two degenerate but distinct heavy valence flavours. Neglecting the term proportional to M sum in Eq. (44), we conclude that in this approximation the difference-ratio method is based on Eqs. (36) and (37), which depend on the exactly known subtracted quark mass difference [m q,3 − m q,1 ] and suitable PCAC quark mass ratios, but not on subtracted quark mass averages m q,ij and κ crit .

Conclusions
In the present study we have addressed, for the first time within the finite-volume Schrödinger functional setup, the non-perturbative determination of the ratio of the scalar to pseudoscalar non-singlet renormalisation constants Z S /Z P in Wilson's lattice QCD, exploiting suitable massive chiral Ward identities. We have shown that in lattice QCD with three flavours of Wilson-Clover quarks (with non-perturbative c sw [47]) and tree-level Symanzik-improved gauge action, the Ward identities are restored up to O(a 2 ) at finite lattice spacing. In order to ensure a smooth dependence of the renormalisation constant ratio on the bare gauge coupling, we have enforced a constant physics condition by working with an approximately fixed physical volume of spatial extent L ≈ 1.2 fm and T /L ≈ 3/2. Our main results are the parameterisation of Z S /Z P in Eq. (34), valid for bare couplings 1.55 g 2 0 1.85 (i.e., lattice spacings 0.042 fm a 0.105 fm), as well as the values for Z S /Z P , given in Table 6, at the bare couplings typically employed in the large-volume N f = 2 + 1 CLS ensembles [18][19][20][21]. On the technical level, we had to treat properly the topology freezing encountered in our simulations, principally at the finest lattice spacing, which may prevent a trustworthy estimation of the statistical error. The operator character of Ward identities ensures their validity in sectors of fixed topological charge. Thus we have projected the correlation functions entering the Ward identities onto the trivial topological sector throughout our analysis.
Several checks have been performed, in order to guarantee the stability of the analysis and a careful assessment of the statistical as well as the systematic errors. In particular, we have verified that results on Z S /Z P from the different classes of Ward identities at our disposal are perfectly consistent with each other as expected, i.e., up to ambiguities of O(a 2 ) or even higher. Among the various estimators for [Z S /Z P ](g 2 0 ), our preferred choice, advocated in Eq. (34), was guided by the structural simplicity of the underlying chiral Ward identity, its numerical precision, and its robustness against systematic effects.
Since the range of couplings covered in this work matches those of the large-volume gauge field configurations generated by CLS with the same lattice action, our result for [Z S /Z P ](g 2 0 ), combined with the scale dependent renormalisation factor Z P from [42], can be used in the computation of quark masses as outlined in Section 5. Work in this direction, extending the (2 + 1)flavour computations of light, strange and charm quark masses on the CLS ensembles reported in refs. [25,41], is in progress.
In the Schrödinger functional framework, standard zero-momentum sources are defined as follows 15 : where ζ and ζ are the quark fields at the Schrödinger functional boundaries x 0 = 0 and x 0 = T , respectively.

Appendix B: Properties of su(N f ) Lie algebra generators
Our conventions for the su(N f ) Lie Algebra are those of Appendix A.3. of Ref. [2]. In general, the anti-Hermitean generators of the algebra satisfy We work in the fundamental representation, with the generators normalised so that The anticommutator of these generators is given by where I N f is the dimension-N f unit matrix. The structure constants f abc are real and totally antisymmetric tensors, while d abc are real and totally symmetric. Two useful identities are For N f = 2 we have T a = τ a /(2i) (τ a are the Pauli matrices), f abc = abc (the Levi-Civita symbol) and d abc = 0.
For N f = 3 we have T a = λ a /(2i) (λ a are the Gell-Mann matrices). The non-vanishing structure constants All operators of interest are flavour non-singlets and, unless otherwise stated, quark masses are degenerate. For Wilson fermions, with O(a) Symanzik improvement, we know that the improved current is correctly normalised a follows: The renormalised and Symanzik-improved scalar and pseudoscalar densities are given by with am q = 1/(2κ)−1/(2κ crit ) the subtracted bare mass; here κ is the Wilson hopping parameter and κ crit its critical value (chiral limit). The mass matrix of subtracted quark masses is denoted by M q . The current (bare) quark mass, which appears in the chiral Ward identities of the present paper, is defined by the PCAC relation (C.5) The renormalised quark mass m R is given in terms of the current mass m by For two distinct flavours i, j, the subtracted quark masses are am q,i = 1/(2κ i ) − 1/(2κ crit ) and similarly for am q,j . The PCAC mass is defined as and the renormalised quark mass average is expressed in terms of m ij as follows: where m q,ij ≡ (m q,i +m q,j )/2. This reduces to Eq. (C.6) for two degenerate masses m q,i = m q,j . In practice for the divergence of the improved axial current we use 16 See Ref. [28] for their definitions. 17 The Dirac matrix conventions used in the present work are those of Appendix A of Ref. [2]. The charge conjugation conventions are those of Appendix B of the same reference. Having shown that the r.h.s. of WI (11) is real, the l.h.s. must also be real. As a crosscheck we show this explicitly. The l.h.s. correlation function is given by Eq. (14), with the traces of flavour matrices given by Eqs. (15)- (20) and the 9 terms F AP;k listed in Table 1. Taking the Hermitean conjugate of these terms we find that the one-boundary ones are related pairwise by complex conjugation, Hermitean conjugation also implies that the quarkdisconnected contributions are real: From these properties it immediately follows that the l.h.s. of the WI is real.
However we want to go a step further and show the reality of the traces F AP;1 , . . . , F AP;9 . For the oneboundary contributions, Eqs. (D.5) imply that Comparing this result to Eq. (D.7) and recalling that QCD correlation functions remain invariant under charge conjugation, we deduce that Im (F AP;1 ) = 0. Analogously, F AP;2 , . . . , F AP;6 are also real. Concerning oneboundary contributions, traces T abcd 7 , T abcd 8 , T abcd 9 are easily seen to be real from Eq. (B.2). The reality of F AP;7 , F AP;8 , F AP;9 then follows immediately from Eqs. (D.6). This completes our proof that also the l.h.s. of WI (11) is real.

Appendix E: Non-perturbative checks
As additional validation of our method we want to make sure that the relations (24), (27) and (31) which relate different diagrams to one another are fulfilled up to ambiguities of O(a 2 ). After making sure that the identities are valid at tree-level of perturbation theory we evaluate them non-perturbatively on our ensembles. The analysis is analogous to the one for the ratio Z S /Z P . After evaluating the identities on each lattice for a given value of β, we perform an extra-or interpolation to the chiral point linear in the current quark mass. The values presented here are the results at the chiral point obtained from this procedure. The clearest evidence comes from identity (24) which we can rewrite as ∆ 5 / ∆ 1 = 1 + O(a 2 ) . (E.1) In the top part of Fig. 8 we present the results which show the expected scaling towards the continuum. The identities (27) and (31) are more complicated to verify as they involve quark disconnected contributions. We can rewrite the identities as follows The numerical results are presented in the bottom part of Fig. 8. In this case the statistical uncertainties are orders of magnitudes larger and grow towards the continuum limit. A possible explanation of this is that the ∆ i involved here are vanishing at tree-level in perturbation theory. Despite the large uncertainties our data still suggest that the identities are fulfilled up to the expected ambiguities in the lattice spacing.