Non-perturbative determination of improvement coefficients b_m and b_A-b_P and normalisation factor Z_m*Z_P/Z_A with N_f=3 Wilson fermions

We determine non-perturbatively the normalisation constant Z_m*Z_P/Z_A as well as the Symanzik coefficients b_m and b_A-b_P, required in O(a) improved quark mass renormalisation with Wilson fermions. The strategy underlying their computation involves simulations in N_f=3 QCD with O(a) improved massless sea and non-degenerate valence quarks in the finite-volume Schroedinger functional scheme. Our results, which cover the typical gauge coupling range of large-volume N_f=2+1 QCD simulations with Wilson fermions at lattice spacings below 0.1 fm, are of particular use for the non-perturbative calculation of O(a) improved renormalised quark masses.


Introduction
Quark masses are amongst the fundamental parameters of the theory of strong interactions. Their high-precision determination is one of the main goals of lattice QCD (see ref. [1] and references therein). These computations suffer from statistical and systematic uncertainties, which can be reduced in a controlled way. An important source of uncertainty are cutoff effects, which are removed by computing a given quantity at several lattice spacings, followed by continuum extrapolation. For several variants of lattice fermions (staggered, domain wall, overlap, twisted-mass) these uncertainties are O(a 2 ), while for Wilson fermions they are O(a). A related problem in the latter formulation is the loss of chiral symmetry, because it complicates the renormalisation properties of most quantities. A frequently cited example of these complications is the power divergence m crit ∼ 1/a that must be subtracted from bare quark masses before they are renormalised multiplicatively. Another example is the fact that the normalisation factor Z A of the axial current and the ratio Z S /Z P of the scalar and pseudoscalar density renormalisation parameters are finite functions of the gauge coupling, which are equal to unity only in the continuum limit where chiral symmetry is fully recovered.
In spite of these shortcomings, Wilson fermions have advantages compared to other popular regularisations, namely strict locality (leading to relatively reduced computational costs) and preservation of flavour symmetry. It is the regularisation of choice of our collaboration, which is part of the effort by the CLS (Coordinated Lattice Simulations) cooperation to simulate QCD with N f = 2 + 1 flavours of non-perturbatively improved Wilson fermions [2][3][4][5].
Wilson fermion O(a) discretisation effects are systematically removed by introducing so-called Symanzik counter-terms in the lattice action and composite operators. These counter-terms are higher dimensional operators with coefficients which are functions of the gauge coupling. The coefficients must be appropriately tuned so that O(a) improvement is achieved. Some of them (c sw , c A , etc.) remove discretisation effects which are present also in the chiral limit, whereas others (b m , b A , b P , etc.) are proportional to the quark masses and improve quantities off the chiral limit. The requirement for improvement in the fermionic sector has been noted early on [6], and only a few strategies to determine them non-perturbatively have been developed so far.
In the present work we compute non-perturbatively the coefficients b m , b A − b P and the renormalisation parameter Z ≡ Z m Z P /Z A in a theory of three sea quark flavours. The methods we use can be traced back to ref. [7]. In that work, renormalised quark masses were defined both through the PCAC bare quark masses and the subtracted bare Wilson masses. In both definitions O(a) improvement is introduced through the inclusion of all necessary c-and b-type counter-terms. Combining these results at constant bare gauge coupling provides estimates of b m , b A − b P and Z. This work has been extended in refs. [8][9][10], where results for other improvement coefficients were also reported. These computations were performed in large volumes with (anti)periodic boundary conditions. In parallel, in ref. [11] (and subsequently in [12]) the method was extended and applied to small physical volumes with Schrödinger functional boundary conditions. These early analyses were carried out in the quenched approximation. More recently, in ref. [13], b m , b A − b P and Z were measured in a theory with N f = 2 sea quarks, employing the Schrödinger functional scheme and working at a constant value of the renormalised coupling, so as to keep the physical extent of the lattice fixed. By thus imposing improvement and renormalisation conditions along a line in lattice parameter space, where all physical scales stay constant, it is ensured that any intrinsic higher-order lattice spacing ambiguities of b-coefficients and Z-factors vanish uniformly as the continuum limit is approached.
Our strategy follows closely that of ref. [13]. However, the extraction of the final estimates from our data has been improved by the introduction of several novelties in the data analysis, which enable us to obtain very reliable estimates in the chiral limit. The lattice action we employ consists of the tree-level Symanzik-improved gauge action [14] and the non-perturbatively improved Wilson-clover fermion action [15]. Our simulations are performed in the range of bare couplings, where gauge configurations on lattices with large physical volumes with N f = 2+1 sea quarks have been generated by CLS [2,4]. These configurations are suitable for the computation of bare correlation functions, on the basis of which low-energy hadronic quantities can be evaluated. In ref. [3] bare PCAC quark masses have been computed from these ensembles. To obtain renormalised up, down, and strange quark masses from these bare masses, one also needs the following: (i) The multiplicative mass renormalisation factor 1/Z P at low energies and its non-perturbative running up to high energy scales; these are known in the Schrödinger functional scheme from ref. [16]. (ii) The axial current improvement coefficient c A and its normalisation constant Z A , which are known from refs. [17] and [18,19], respectively. (iii) The improvement coefficient b A −b P , which is one of the results of this work. (iv) The improvement coefficientb A −b P , which is particularly difficult to estimate but may be ignored, as it is sub-leading in perturbation theory.
Independent estimates of Symanzik b-coefficients, directly computed on CLS ensembles and obtained with a variant of the coordinate space method of ref. [20], have been reported in [21]. A comparison of these results to ours may be found in sect. 5. Preliminary results of the present work have been reported in ref. [22]. For recent determinations of b-coefficients in the vector channel of three-flavour QCD with the same lattice action, see also refs. [23,24].

Quark mass renormalisation and improvement with Wilson fermions
In this section we review the renormalisation and O(a) improvement of quark masses in the framework of lattice regularisation with Wilson quarks. These results were first derived in ref. [25] for QCD with degenerate masses and generalised in ref. [26], which is the basis of our résumé. The starting point is the subtracted bare quark mass of flavour where κ i is the hopping parameter, κ crit its critical value corresponding to the chiral limit with N f degenerate flavours, and a is the lattice spacing. In terms of the subtracted masses m q,i , the O(a) improved, renormalised quark mass is given by We recall in passing that the renormalisation parameter Z m (g 2 0 , aµ) depends on the renormalisation scale µ and diverges logarithmically in the ultraviolet. A mass-independent renormalisation scheme is implied throughout this work. In such a scheme, the Symanzik coefficients b m ,b m , d m ,d m as well as r m are functions of the squared bare coupling g 2 0 . 1 In 1 In an improved mass-independent scheme, the coupling is to be defined asg 2 0 ≡ g 2 0 (1 + bgaTr [Mq]), see ref. [25]. As this coupling redefinition affects b-counter-terms by O(a 2 ) terms, we need not take it into consideration in the present work. a non-perturbative determination at non-zero quark mass, they are affected by O(am q,i ) and O(aTr[M q ]) systematic effects, which are part of their operational definition. They have the following properties: 2 i) The (r m − 1)-term multiplies Tr[M q ], so it arises from a mass insertion on a quark loop; i.e., from diagrams like , where the filled square indicates a mass insertion. It is a two-loop effect, contributing at O(g 4 0 ). Its determination is beyond the scope of this paper.
ii) The b m -term multiplies m 2 q,i . Thus it arises from: (a) the mass dependence of valence quark propagators and (b) the mass-independent contributions of the fermion loops. The former dependence begins at tree-level, so that b m = −1/2 + O(g 2 0 ), corresponding to Feynman diagrams like and .
The latter dependence begins at two loops, contributing at O(g 4 0 ); cf. for example diagram .
The quenched b m -value differs from the one of the full N f theory by the mass-independent contributions of the fermion loops; consequently the difference arises at two loops and is O(g 4 0 ).
iii) Theb m -term multiplies m q,i Tr[M q ]. The factor of m q,i comes from the valence line, while the Tr[M q ] from a quark loop. It thus begins at two loops in perturbation theory (b m ∼ O(g 4 0 )) and vanishes in the quenched approximation; e.g., diagram .
The determination of theb m -term is beyond the scope of this paper.
iv) The (r m d m − b m )-term multiplies Tr[M 2 q ]; so it must arise from two insertions of m q,i on a single sea-quark loop. This combination begins at two loops, so it is O(g 4 0 ); cf. diagram . But since it arises from sea quark propagators, while b m has tree-level and O(g 2 0 ) valence-quark contributions, it follows that also d m must get tree-level contributions and O(g 2 0 ) corrections from valence lines. The determination of (r m d m − b m ) is beyond the scope of this paper.
v) The (r mdm −b m )-term multiplies Tr[M q ] 2 , so it can only arise from mass insertions on two separate quark loops. Thus this term begins at three loops and is O(g 6 0 ); cf. diagram . But sinceb m itself begins at two loops, so mustd m . The determination of (r mdm −b m ) is beyond the scope of this paper.
Next we recall that with Wilson fermions the renormalised quark mass can also be related to the bare PCAC mass m ij , defined through the following relation: where m ij = (m i + m j ) /2. Our notation is standard: the non-singlet bare axial current and the pseudoscalar density are given by with indices i, j denoting two distinct flavours. The pseudoscalar density P ij and the current ( being in principle only a function of the gauge coupling. 3 The operator O ji is defined in a region of space-time that does not include the point x, thus avoiding contact terms. Our specific choice of correlation functions for eq. (2.3) is discussed in appendix B.
Beyond the chiral limit, composite operators require improvement through the introduction of b-type Symanzik counter-terms. The renormalised and O(a) improved axial current and pseudoscalar density are given by [25,26] with m q,ij ≡ (m q,i + m q,j ) /2. The normalisation of the axial current Z A (g 2 0 ) is scaleindependent, depending only on the squared gauge coupling g 2 0 . The renormalisation parameter Z P (g 2 0 , aµ) also depends on the renormalisation scale µ and diverges logarithmically in the ultraviolet. The Symanzik coefficients b A , b P ,b A andb P are in principle only functions of the bare squared coupling. They have the following properties: vi) The b A -and b P -terms multiply m q,ij . Thus they arise from: (a) the mass dependence of valence quark propagators and (b) the mass-independent contributions of the fermion loops. The former dependence begins at tree-level, so that b A , b P = 1/2 + O(g 2 0 ); cf. diagrams and .
The latter dependence begins at two loops, contributing at O(g 4 0 ); cf. diagram .
The difference b A − b P , appearing in eq. (2.8) below, is therefore O(g 2 0 ). The quenched values differ from those of the full N f theory by the mass-independent contributions of the fermion loops; consequently the difference arises at two loops and is O(g 4 0 ).
vii) Theb A -andb P -terms multiply Tr[M q ]. They arise from the mass dependence of quark fermion loops. They begin at two loops in perturbation theory (b A ,b P ∼ O(g 4 0 )) and vanish in the quenched approximation; cf. diagram . The determination of these coefficients is beyond the scope of this paper.

The renormalised PCAC relation
The properties of the various b-coefficients listed above eqs. (2.2), (2.5), (2.6) and (2.8) also apply to the non-unitary theory, where valence and sea quarks of the same flavour have different masses. We saw that all terms containing traces of the fermion matrix refer to sea quarks, while the others refer to valence quarks. This is shown somewhat more explicitly in appendix A. The present and previous works, such as ref. [13], rely on this property in order to obtain reliable non-perturbative estimates of the Symanzik coefficients b m and b A − b P , as well as the combination (2.9)

Non-perturbative determination of b m , b A − b P and Z
If we calculate the average mass (m i,R + m j,R )/2 from eq. (2.2) and equate the result to the r.h.s. of eq. (2.8), we obtain an expression, which relates subtracted and PCAC bare masses: Note that the product of the renormalisation parameters Z P (g 2 0 , µ)Z m (g 2 0 , µ) is scale independent.
As discussed previously and in appendix A, eq. (2.10) remains valid in the non-unitary theory, with m q,i denoting valence quark masses and M q the mass matrix of sea quarks. Since b A − b P , b m and the combination Z ≡ Z m Z P /Z A are short distance quantities, they can be determined in small physical volumes with Schrödinger functional boundary conditions. This allows for simulations with degenerate sea quarks lying very close to the chiral limit, so that terms containing Tr[M q ] can be dropped in eq. (2.10). Following the strategy already proposed in refs. [11,13], we introduce two valence quark flavours with subtracted masses m q,1 < m q,2 and their average m q,3 ≡ (m q,1 + m q,2 )/2. It is then straightforward to obtain estimators of the desired improvement coefficients and normalisation factor Z from the ratios where bare PCAC masses m ii (with i = 1, 2, 3) are defined through eq. (2.3) for two degenerate but distinct flavours. Note that the leading improvement coefficients obtained from R AP and R m suffer from mass-dependent O(a) effects, which introduce only O(a 2 ) uncertainties in the quark masses; cf. eqs. (2.2) and (2.8). The O(aTr[M q ]) effects appearing on the r.h.s. of eqs. (2.11a), (2.11b) and (2.11c) arise from the presence of a residual non-zero sea quark mass in realistic simulations. This uncertainty is removed once simulations are performed for several sea quark masses and the chiral limit is reached by extrapolation. In our setup we are also able to simulate negative current quark masses, so in practice interpolation to the chiral limit is also possible. In the chiral limit, the leading normalisation factor Z of the estimator R Z suffers from O a 2 effects; this is easily derived from eqs. (2.10) and (2.11c).
The above ratios are not the only possible estimators of the quantities of interest. For example, we can modify R AP and R m by replacing the denominator (m 11 − m 22 ) by any of the following PCAC mass differences: 2(m 22 − m 33 ), 2(m 33 − m 11 ), 2(m 22 − m 12 ), or 2(m 12 − m 11 ). As shown in ref. [22], these new ratios also provide estimates of (b A − b P ), b m and Z, with different finite cutoff effects. In practice these differences were found to be orders of magnitude smaller than other systematic effects. So we have retained the original estimators R AP , R m and R Z in the present work.
As previously stated, improvement coefficients are short distance quantities, which can be determined in small physical volumes, using the Schrödinger functional setup, with L 3 × T lattices having periodic boundary conditions in space and Dirichlet boundary conditions in time. Definitions of boundary operators and related correlation functions are given in appendix B. For reasons explained above, sea quark masses are tuned closely to the chiral limit, in line with the usual ALPHA Collaboration choice of mass-independent renormalisation schemes.

The strategy revisited
In refs. [11,13] the computation of the Symanzik b-coefficients proceeds as follows: the PCAC quark masses m 11 , m 22 , m 12 and m 33 are first determined in standard fashion from eq. (B.3), and are subsequently fed into the ratios R X (with X = AP, m, Z) defined in eqs. (2.11a), (2.11b) and (2.11c). In some cases, this procedure turns out to suffer from numerical instabilities: the numerators of R X are current quark mass differences, i.e., constructed so that the leading contributions in powers of the lattice spacing a cancel, isolating the b-counter-terms. If these delicate cancellations in the mass differences occur with inadequate precision, the signal may be lost to the noise. Moreover, as we decrease the heavier quark mass m q,2 towards m q,1 , striving to reduce discretisation effects, both numerator and denominator of the three estimators R X will contrive to give a noisy signal. We will show in appendix C (cf. figure 10) examples of this instability. In order to overcome this problem, we introduce here a more elaborate method of analysis of quark masses evaluated from correlator measurements, which should ameliorate the stability of the results.
We start with some general considerations. At fixed gauge coupling, the bare PCAC quark masses m ij defined in eq. (2.3) depend on the subtracted valence quark masses m q,i and m q,j and the trace of the sea quark mass matrix Tr[M q ]. Since in our simulations sea quark masses are very close to the chiral limit but not strictly zero, in what follows we keep Tr[M q ] terms in the equations. The current masses m ij are symmetric functions under the exchange m q,i ↔ m q,j . This implies that they can be expressed as a power series of the form with the mass-splitting denoted as ∆ ij ≡ 1 2 (m q,i − m q,j ) and the dimensionless coefficients as C nk . The latter only depend on the gauge coupling and flavour-blind traces of the sea quark masses.
To next-to-leading order in the lattice spacing, the Symanzik expansion for this expression is given by eq. (2.10). A comparison with eq. (2.12) shows that, to this order, the expansion coefficients read (2.13a) If the sea quark masses were tuned exactly to their critical value, we would have C 00 = 0 and C 01 = Z. Moreover, all C nk would be functions of only g 2 0 in perturbation theory. A non-perturbative determination of the C nk , however, depends on the imposed improvement condition.
Turning next to the specific case under study, we recall that the ratios R X require three quark masses. We define the lightest one to be m q,1 and set it to the value of the three degenerate sea quark masses used in our simulations, thus having m q,1 = Tr[M q ]/N f . In practice, its value is very small, but not strictly zero within statistical errors, cf. the x-axis of figure 3. The heavier mass is m q,2 and the average of the two is m q,3 , i.e., we have m q,2 > m q,3 > m q,1 . Starting from eq. (2.12), we now write the current quark masses m 11 , m 22 and m 12 as power series, with m 22 and m 12 re-expressed in terms of m q,1 and the (partially-quenched) mass-splitting ∆ ≡ 1 2 (m q,2 − m q,1 ) = m q,3 − m q,1 : We observe that their first derivatives are exactly related via 1 2 By construction, the unitary setup is recovered when ∆ → 0, in which case m 12 , m 22 → m 11 . Written in this way, we can always expand the masses am 12 and am 22 close to am 11 , where then ∆ becomes the expansion parameter: The coefficients N i and D i for the flavour non-diagonal and diagonal masses, respectively, are linear combinations of the C nk and carry a residual dependence on am q,1 . Our particular choice of the third mass, m q,3 ≡ m q,12 , leads to the identity m 33 (∆) ≡ m 22 (∆/2). Accordingly, we have the expansion at our disposal and can revisit eqs. (2.11) in the context of our current discussion: The original estimator R Z was constructed in order to cancel an O(a∆) effect [11], i.e., to reduce the largest bias in the determination of the leading order factor Z. In the unquenched theory, uncancelled O(aTr[M q ]) effects remain in all quantities. They are typically supressed by 1 -2 orders in magnitude due to the sea quark mass tuning (m 11 ≈ 0) required in a mass-independent renormalisation scheme. In the determination of the estimators R X , a renormalised trajectory, or line of constant physics (LCP), has to be employed such that they adopt the proper scaling behaviour when the bare gauge coupling g 2 0 is varied. This means that, besides m 11 = 0, a value for the mass-splitting ∆, which fixes the LCP in the valence sector, has to be specified. In principle, any sensible choice ∆ = 0 is sufficient to define a valid set {R AP , R m , R Z } ∆ that achieves O(a) improvement in physical quantities. Different choices lead to somewhat different approaches to the continuum limit and are equivalent in the framework of Symanzik's effective theory. Their relative difference is a higher-order cutoff effect that vanishes for a → 0 as can be easily seen in eqs. (2.21). A non-perturbative determination of these estimators inherits a non-trivial all-order dependence on a∆ if the limit ∆ → 0 is not taken. In that sense, an explicit choice of ∆ constitutes an ambiguity in their definition. In ref. [13], for instance, ∆ has been held constant by requiring Lm 22 ≈ 0.5 at constant physical L with L/a ∈ [12,24].
In the present paper we aim at eliminating this ∆-ambiguity in the definition of R AP , R m and R Z , because it can potentially lead to larger cutoff effects in the physics of light quarks. By noting that eqs. (2.18) and (2.19) for am ij can literally be used as joint ansatz for interpolating fit functions (polynomials in a∆), we are able to build the standard estimators according to eqs. (2.21) by dropping the O(a∆) terms explicitly. In this case only the first few parameters of the polynomials are relevant, which can be well controlled by sufficiently scanning the diagonal and non-diagonal current quark masses as functions of a∆. The sub-leading effects in the sea quark mass of order aTr[M q ] can be removed by extrapolation or interpolation. The presented proposal has the additional advantage that no iterative tuning of the second (and subsequently third) mass is required in advance.
We will refer to the results of this analysis as LCP-0, since they are obtained along the line of constant physics which keeps all the masses equal to zero: Here we have introduced the current quark mass difference which is in one-to-one correspondence with the difference of bare subtracted quark masses a∆ and reduces to Lm 22 in the chiral limit, m 11 = 0. Besides the determinations in the massless unitary setup, we will also give results for massive valence quarks. In fact, experience shows that in large-volume simulations, heavy-flavour Wilson quarks have sizeable mass-dependent cutoff effects in the typical range of lattice spacings 0.04 a/fm 0.1. For that reason one may favour the opposite interpretation and exploit the freedom in Symanzik's effective theory to determine the improvement functions R X at a value of ∆ that is as close as possible to the characteristic heavy quark mass scale typically involved in the application in question. By doing so, the interpolating functions for the PCAC masses, eqs. (2.18)-(2.20), have to be evaluated at ∆ = 0 and fed into the defining expressions for the estimators. At the non-perturbative level, this corresponds to a resummation of all higher-order terms in a∆ for the chosen  Overview of the simulation parameters of the N f = 3 ensembles (labeled by ID) that represent our data. Subsequent columns refer to the lattice dimensions L 3 T /a 4 , the inverse gauge coupling β = 6/g 2 0 , the light (sea) quark hopping parameter κ 1 , the number of replica N r , the number of configurations per ensemble, both in total (N cfg ) and in the subset of configurations with zero topological charge (N (0) cfg ), and the corresponding PCAC sea quark masses. All ensembles have configurations separated by 8 molecular dynamic units (MDU), except for A1k3 and D1k4 that have 4 and 16 MDU, respectively. Compared to the data base of [17,18], we have generated and used the nearly chiral ensembles A1k3, A1k4, B1k4 and D1k4, and significantly increased statistics for E1k1 and E1k2.
line of constant physics. The effectiveness of this approach was demonstrated in ref. [13], where two determinations of R X at Lm 22 ≈ 0.5 and Lm 22 ≈ 2.5 were probed in the heavy quark sector with masses above and below the bottom quark mass, finding a more significant reduction of mass-dependent cutoff effects and an extension of the a 2 -scaling region in the case of the largest ∆. Therefore, we also introduce a second line of constants physics in the valence sector, LCP-1: Thus we will determine a second set of estimators R X , suitable for calculations with 2 + 1 dynamical light quarks and valence charm quarks. In sect. 4 we will elaborate on both variants of the data analysis and the achieved control over the systematic effects.

Gauge configuration ensembles
The three-flavour lattice QCD simulations in the Schrödinger functional framework have been performed using the openQCD code of ref. [27], with tree-level Symanzik-improved gauge action [14], N f = 3 massless Wilson-clover fermions, vanishing boundary gauge fields C = C = 0 and boundary fermion parameter θ = 0. The value of the improvement coefficient c sw is taken from ref. [28]. The RHMC algorithm [29][30][31] is used for the third dynamical quark. The relevant modification of the integration measure of the fermion determinant is then compensated by the inclusion of a reweighting factor in the analysis.
Most of the ensembles in this study coincide with those of refs. [17,18], where the improvement coefficient c A and the normalisation constant Z A of the axial vector current are determined. In these works the constant physics condition is fixed by setting L ≈ 1.2 fm. This is achieved by beginning with a particular pair of g 2 0 and L/a (β = 6/g 2 0 = 3.3 at L/a = 12 here). Next we chose the bare couplings for subsequent smaller lattice spacings according to the universal two-loop β-function. In this way, lattice spacings are covered in the range from a ≈ 0.09 fm to a ≈ 0.045 fm. At each bare coupling, we generate ensembles for a few small values of the bare sea current quark mass am 11 , in order to obtain an estimate of the critical point κ crit and to be able to extrapolate to am 11 = 0 at a subsequent stage of the analysis. Table 1 gives an overview of the ensembles used in this work. The labelling of these ensembles, based on an alphanumeric four-symbol code such as A1k1, follows the conventions: the first letter (A-E) represents a specific lattice geometry L 3 T /a 4 , while different choices of β for a given geometry are distinguished by the subsequent number. In the present work, we have a single β for each geometry. Separated by a "k", the final integer labels the sea quark hopping parameter κ 1 = κ sea . In addition to the ensembles available to us from previous ALPHA Collaboration simulations [17,18], we have generated ensembles A1k3, A1k4, B1k4, D1k2 and D1k4, with κ sea tuned so that the corresponding PCAC masses are closer to the chiral limit. Furthermore, the replica lengths of ensembles E1k1 and E1k2 were increased for larger statistics and a more reliable estimation of autocorrelations.
Note that the values of β are in the same range as those of the large-volume ensembles produced with the same lattice action by the CLS (Coordinated Lattice Simulations) effort [2][3][4]. Therefore, our results can be applied in, e.g., determinations of O(a) improved phenomenological quantities such as quark masses and decay constants.

Topological charge
In QCD with Schrödinger functional boundary conditions, disconnected topological sectors emerge in the continuum limit. However, for small or intermediate physical volumes as employed here, non-trivial topological sectors, i.e., those with topological charge Q = 0, only receive a small weight in the partition sum. This issue of topology freezing, previously investigated in refs. [17,18,32,33], may be met by projecting the quantities of interest to the Q = 0 topological sector. For the case at hand, involving improvement coefficients and renormalisation constants, quantities defined in this way differ from their full-topology counterparts only by irrelevant cutoff effects and exhibit a smooth approach to the continuum limit. Figure 1 shows Monte Carlo histories of the topological charge Q and its distribution on three exemplary ensembles. The effect of topology freezing is clearly visible: while the topological charge is appropriately sampled for the coarse lattice spacing in the A1 ensembles (top), the HMC algorithm is not able to properly tunnel between different topological charge sectors at finer lattice spacings (L ≈ const); this becomes most pronounced for the D-ensembles (bottom). Sectors with Q = −1, 0, 1 are mostly sampled, and the charge re- mains for a longer Monte Carlo time in single sectors when the lattice spacing is decreased. Such a behaviour is in line with similar findings, e.g., in [17][18][19]. As in these references, we thus have confined the analysis to the sector with zero topological charge, in order to avoid potential bias from improper sampling and associated, unresolved large autocorrelation times that could affect a reliable statistical error estimation for our observables. Note that this procedure is also theoretically sound, since our strategy to extract b m , b A − b P and Z relies on PCAC quark masses defined through Ward identities which, being operator relations, hold also within a single topological sector. Although this projection to zero topological charge typically comes at the expense of larger statistical uncertainties and a slightly modified cutoff dependence, it is not expected to induce a noticeable difference in the final results. This is indeed confirmed in tables 2 and 3.
The projection of an observable O onto the sector of trivial (Q = 0) topology was introduced in [32,33] via where δ Q,0 is replaced by step functions, because the topological charge takes non-integer values in finite volume. We adopt the charge defined via the gradient flow as in ref. [34], at gradient flow time corresponding to a smoothing ratio of c ≡ √ 8t/L = 0.35. For comparison, we also quote the results for the analysis including all topological sectors.
Further details pertinent to our ensembles can be found in refs. [17,18], which report N f = 3 calculations of c A and Z A . These concern the implementation of the line of constant physics, the negligible influence of its small violations on the results, the simulation algorithm used to generate the gauge configurations, and the projection onto the trivial-topology sector.

Statistical error analysis
The statistical uncertainties are determined using the Γ-method [35,36], so as to take the autocorrelations of all observables into account. An independent analysis, using jackknife error estimation, was done as follows. First the replica of an ensemble are concatenated and subsequently subdivided into bins of width ten. Then, the standard jackknife error is computed by eliminating a single bin average at a time. The bin width was specified by varying the bin size and choosing the minimal value at which the jackknife error stabilises. Error estimates from both methods are in very good agreement.
The chiral extrapolations discussed in the next section are based on independent datasets of ensembles belonging to the same group (e.g. of A1k1, A1k3 and A1k4 belonging to the A1-group). In the context of the jackknife error analysis, we exploit the embedding trick for combining statistically independent runs, described in appendix A.3 of [37].

Data analysis
The definitions of the Schrödinger functional (SF) correlation functions f A , f P and the PCAC quark masses m ij are standard; therefore, we defer all details to appendix B.
For each ensemble we compute valence quark propagators for m q,1 (which is fixed by the sea quark hopping parameter of the simulation) and for O (15)  the range 0 ≤ L∆ 22 ≤ 1, as well as for the corresponding m q,3 ≡ 1 2 (m q,1 + m q,2 ). Earlier approaches [11,13] relied upon the three distinct masses, in order to evaluate the estimators R X by direct use of eqs. (2.11). In the strategy adopted in the present work, m 33 is simply another current quark mass diagonal in flavour, so it is on an equal footing with m 22 , thereby enriching the density of points in the low mass region. Results from the earlier method and the issues related to it are discussed in appendix C.
The SF correlation functions for the appropriate mass combinations are obtained be utilizing the sfcf program [38].
We compute the PCAC masses m 11 , m 22 , m 12 and m 33 for each time-slice x 0 using the improved lattice derivatives of eq. (B.4). Then we average over the middle third of the time extent T = (3/2)L, i.e., x 0 /a ∈ [L/(2a), L/a] (keeping the physical plateau length constant); an example is shown in figure 9. The more standard choice of the mass definition at x 0 = T /2 with standard derivatives has been taken into account for comparison.

Interpolating functions for PCAC quark masses
We proceed by applying the method described in subsect.  N 1 , N 2 , . . ., D 2 , . . .), constrained to have the same intercept and related first derivatives. In order to avoid over-constraining our fits, we prefer treating am 11 as a free parameter, rather then keeping it fixed to its measured mean value.
We have opted for third-order polynomials. From eqs. (2.21a)-(2.21c) we see that for the determination of the estimators at the unitary point (i.e., a∆ = 0) we only need the polynomial coefficients up to second order. By fitting with third-order polynomials we take into account possible higher-order effects, without contaminating the lower order coefficients. The dependence of the results on the polynomial order is investigated in subsect. 4.3.2.
In the two upper panels of figure 2 we display the difference between the interpolating fit and the data points for the diagonal and non-diagonal masses, respectively. Comparing the deviation of the mean values from zero, we see that the combined fit is an excellent representation of the data down to 10 −5 .

Estimators from the interpolation method
Having determined the fit polynomials of non-diagonal and diagonal PCAC masses, we evaluate the estimators R X at the two lines of constant physics, LCP-0 and LCP-1, specified by eqs. (2.22) and (2.24).
For the unitary case LCP-0 with a∆ = 0 (or, equivalently, L∆ 22 = 0), these estimators are built from the parameters N 1 , N 2 and D 2 according to eqs. (2.21). We gather these results for all ensembles in tables 5 and 6 of appendix D.
The next step towards a fully massless calculation is to extrapolate the results to the     ) have an implicit dependence on am 11 , which only affects O(a 2 ) terms of these expansions. Thus a linear fit appears to be sufficient for our purposes. An example is reproduced for ensemble group B1 (β = 3.512) in figure 3, where chiral interpolations of R Z are presented. 4 Results are listed in table 2, for all topological sectors and after projecting to Q = 0. We repeat the full analysis for LCP-1, i.e., L∆ 22 = 1. The errors of the respective estimators are typically smaller than in the unitary case LCP-0, cf. tables 7 and 8 of appendix D. The interpolation to vanishing sea quark mass m 11 = 0 is also illustrated in figure 3 for R Z of the B1 ensembles, where one can infer that the difference of the two chiral (red) points is statistically significant and represents an a∆-ambiguity. The results for all chiral estimators of LCP-1 are given in table 3.

Ambiguity checks
Before presenting the final results, we discuss the ambiguities arising from our specific choices of improvement conditions. These consist in: the projection to topological sectors, the exact definition of the current quark masses, and the interpolating functions that relate them to the mass difference a∆. All these choices are formulated in a way that respects the constant physics condition among different ensembles. They are part of the non-perturbative operational definitions of the R X , which influence the numerical values of our final results. We will present below some representative examples of these systematic effects. They are found similar in size to those previously observed in the quenched [11] and in the two-flavour [13] determinations.

Standard vs. improved derivatives
In figure 4 we show the differences between final results obtained using improved ("imp") and standard ("std") lattice derivatives (cf. eq. (B.4)) for R AP and R m in the LCP-0 case. These arise as a consequence of O(a) discretisation effects. ∆ d R m is of the order of the statistical errors. As found in the quenched and two-flavour analyses [7,11,13], the estimator R AP is particularly sensitive to the chosen discretisation of the derivatives, resulting in larger ambiguities. Although fluctuations are present, especially for the largest lattice spacings, the ∆ d R X seem to vanish linearly in the continuum limit as expected, see figure 4.

Degree of the polynomial fits
Our results are obtained by fitting PCAC masses with third-degree polynomials. The polynomial degree introduces a further source of uncertainty, which we investigate by monitoring the quantity for our LCP-0 results, extrapolated to the chiral limit. Figure 5 illustrates the differences in R AP and R Z . Here, the resulting intrinsic ambiguities in R AP and R m are O(a), while in R Z they are O(a 2 ). These effects, which in case of R AP and R Z appear to be barely larger than their statistical errors, vanish at smaller lattice spacings. A qualitatively similar behaviour is observed for R m .

Determination with non-unitary valence quark masses
The polynomial fits allow the determination of the estimators at any valence point in the considered mass range 0 ≤ L∆ 22 ≤ 1. Results at different values of L∆ 22 differ by massdependent cut-off effects. Therefore, we also investigate the difference between the LCP-0 results (L∆ 22 = 0) and two choices of heavy valence quarks, namely those at a point with L∆ 22 = 0.25 and at LCP-1 (L∆ 22 = 1). In figure 6 we plot for R AP and R Z . From the scaling behaviour of these estimators it is evident that the relative size of the cut-off effects grows with the valence quark masses, while the differences themselves decrease significantly towards the continuum limit. (Note that this decrease is actually faster than the expected rates ∝ a/L and ∝ a 2 /L 2 for improvement coefficients and renormalisation factors, respectively.) As seen in figure 6, the difference ∆ m R X at fixed a/L roughly scales with an integer power of the ratio of L∆ 22 , i.e., 1/4 for X = AP and (1/4) 2 for X = Z. We note in passing that there is also an implicit dependence of the results on our choice of mass range 0 ≤ L∆ 22 ≤ 1.

Results
Based on our non-perturbative calculation of the estimators R (0) X listed in tables 2 and 3, we now provide interpolating formulae to make them accessible also at other values of the gauge coupling.
To have at least some constraint towards smaller couplings g 2 0 = 6/β, we opt for interpolating formulae that encompass the one-loop perturbative behaviour as g 2 0 → 0. Since there are no theoretical expectations for the functional forms of the estimators in the region of large couplings, we have probed, with varied degrees of success, many conceivable ansätze. We have settled for the following ones: The numerical constants in the above equations are those dictated by one-loop perturbation theory [39,40]. The other parameters are determined from fits. At the unitary chiral point (corresponding to LCP-0) we obtain:    We have also produced a gauge configuration ensemble at β = 8.0, in order to evaluate the estimators R X in the deeply perturbative region. Even though the physical volume of this ensemble is smaller than the LCP one, these β = 8.0 results qualitatively agree with our fit ansätze and their shape as g 2 0 → 0. This corroborates our interpolating fit functions, which asymptotically approach the one-loop perturbative predictions at small gauge couplings.
We cover the g 2 0 -range typical of large-volume computations of bare quark masses, matrix elements and other phenomenological applications. In particular, the N f = 2 + 1 couplings of the CLS effort in refs. [2][3][4][5]41] are β ∈ {3.85, 3.70, 3.55, 3.46, 3.4}. In figure 7 we indicate the CLS g 2 0 -values as vertical dashed lines, and in table 4 we provide our interpolated results at the corresponding β-values. Note that the smallest coupling employed in the CLS simulations, being slightly outside our range β ∈ [3.3, 3.81], can be reached by extrapolation of our interpolating functions. In these cases, near the edges of our β-range, the results are more sensitive to the choice of the specific fit ansatz. The covariance matrices of eqs. (5.3) and (5.5), as well as the statistical errors in table 4, are large enough to cover the results obtained from different fit ansätze we have tried out. Moreover, it can be seen from figure 7 that the error band at the CLS couplings is consistent with the errors of the neighbouring data points. We have verified explicitly that any effect on the error of a typical combination of our estimators coming from the correlations among the R X 's (which in principle would have to be taken into account when, for instance, calculating renormalised quark masses) is negligible compared to the inflated statistical uncertainties of the fits. Nevertheless, for the sake of completeness, we quote in appendix E the correlation matrices among the R X 's.
Our results are compatible with the recent ones by Korcyl and Bali [21] within their considerably larger errors; cf. figure 8. Any differences between the two sets of results at the same coupling g 2 0 are to be attributed to discretisation effects. Finally we note that, since Z S /Z P = (Z m Z P ) −1 = (Z A Z) −1 , the (scale independent) ratio of renormalisation constants Z S /Z P may be obtained by combining R Z from this work with the interpolation formula for Z A from [18,19]. A direct determination of Z S /Z P based on the Ward identity approach is in progress [42].

Conclusions
The present paper is part of a series of publications dedicated to the non-perturbatively O(a) improved quark mass renormalisation in three-flavour lattice QCD with Wilson fermions. It complements previous determinations of the axial current improvement and normalisation [17][18][19] and the renormalisation factor of the pseudoscalar density [16] by a non-perturbative calculation of the improvement coefficients b A − b P and b m -multiplying associated additive, quark mass dependent Symanzik counter-terms -as well as of the normalisation factor Z ≡ Z m Z P /Z A . We work in the framework of lattice QCD with N f = 3 flavours of mass-degenerate, non-perturbatively O(a) improved Wilson-clover sea quarks and tree-level Symanzik-improved gluons.
Our computational setup to determine b A − b P , b m and Z m Z P /Z A consists in small physical volume simulations, with Schrödinger functional boundary conditions, and exploiting the PCAC relation with mass non-degenerate valence quarks. Valence quark masses and lattice volumes have been varied ensuring that we approach the chiral and continuum limits while staying on a line of constant physics. Although we have based our work on an earlier N f = 2 publication [13], we have extended that analysis by introducing a series of novelties as explained in the main part of the paper. The final results obtained refer to massless sea quarks. These can be inferred from tables 2 and 3, together with the formulae (5.1), (5.2) and (5.4), which provide smooth parameterisations of b A − b P , b m and Z m Z P /Z A in terms of the bare gauge coupling squared. Several checks have been performed to address the various systematics involved and to guarantee the stability of the analysis strategy as well as the reliability of the quoted error estimates.
Since our range of couplings matches that of the large-volume CLS simulations [2][3][4][5], our results are currently being applied in a (2 + 1)-flavour computation of light, strange and charm quark masses (see [43,44] for a preliminary account). They are also useful in the computation of other physical quantities, such as certain combinations of QCD matrix elements involving the axial current and the pseudoscalar density. Our results for b A − b P , b m , and Z for the specific bare couplings of the CLS ensembles are collected in table 4.

A Non-unitary QCD and improvement coefficients
In this appendix we will further discuss how the key expressions of sect. 2 are modified in the non-unitary version of lattice QCD with sea quarks with masses m q,i (i = 1, . . . , N f ) and valence quarks with masses m val q,i (i = 1, . . . , N val ). It is understood that both sea and valence lattice fermion actions are regularised à la Wilson. In general N val = N f and sea and valence quark masses are unequal. The chiral limit κ crit is the one defined, in some standard fashion, in the unitary theory of N f quarks. All subtracted sea quark masses are defined so as to vanish at κ crit .
The renormalisation and improvement pattern of each valence quark mass [m val i ] R depends on the sea quark mass matrix Tr[M q ] and the bare quark mass m val q,i of this very flavour. (With some standard renormalisation condition imposed for the quark mass, there is no physical reason for a dependence on a valence quark mass of a different valence flavour.) This leads to the expression To the order we are working in the lattice spacing, the coefficients k m , h m ,h m , j m and j m depend on the bare coupling g 2 0 only; any mass dependence is an O(am q ) discretisation effect. If we drive the valence bare quark mass to the value of the corresponding sea quark mass (i.e., κ val i = κ i ), the above expression should reduce to eq. (2.2). This implies the identification We then drive the "non-unitary QCD" formulation to the unitary QCD one: This means that N val = N f , and for each flavour the valence quark mass is equal to that of the sea. The above expression reduces to eq. (25) of ref. [26]. Similarly we obtain which is eq. (24) of ref. [26] when the non-unitary formulation is driven to the unitary one.

B Schrödinger functional correlation functions
Summed over the spatial volume, these yield pseudoscalar boundary sources projected onto zero momentum. From these, the x 0 = 0 boundary-to-bulk forward Schrödinger functional (SF) correlation functions in the pseudoscalar channel are constructed from the axial current and density as Flavour indices i, j are not summed over, and when i = j they denote degenerate but distinct flavours. With O ji replaced by O ji , we also have the x 0 = T boundary-to-bulk backward SF correlation functions g ij A,P (T − x 0 ). In a vanishing background field, they are related to f ij A,P (x 0 ) by time reflection and are averaged in order to reduce the statistical noise.
In our SF framework, the bare PCAC quark masses of eq. (2.3) are given by: where we explicitly indicate their additional dependence on L/a, T /L and the periodicity angle θ in the boundary conditions of the fermion fields. These dependences will usually be implicit, in order to keep the notation simple. In the degenerate case (i = j), m ij reduces to the non-singlet PCAC mass of a flavour degenerate doublet. The first and second lattice derivatives ∂ 0 and ∂ * 0 ∂ 0 in the last equation, upon acting on smooth functions, are the continuum ones up to terms of O a 2 and O a , respectively. Following refs. [7,11], besides using these derivatives, we have computed current quark masses involving derivatives obtained with the replacements Upon acting on smooth functions, these derivatives are the continuum ones up to terms of O a 4 ; thus they are "improved" as far as their discretisation effects are concerned. It is hoped that, when used in the definition of m ij , the resulting estimates of b m , b A − b P , and Z will show milder discretisation effects. This is of course not guaranteed, as other terms of O(a 2 ) from the correlation functions remain uncancelled. C Results from the method of refs. [11,13] In refs. [11,13], estimates for R X were obtained in the quenched and two-flavour cases respectively, using a different method to analyse the data. Here we compare results from this method to those obtained from our analysis. In ref. [13] the heavier mass was tuned to a single value, Lm 22 ≈ 0.50. This was small enough to ensure small discretisation effects and large enough to keep statistical uncertainties under control. Thus, the results quoted in ref. [13] for b A − b P , b m and Z contain O(am 22 ) effects as part of their non-perturbative definition. In the present work, with several Lm 22 values at our disposal, we can extrapolate first to the unitary point m 22 → m 11 and then to the chiral limit m 11 → 0.
In the spirit of refs. [11,13] the current masses m 11 , m 12 , m 22 and m 33 are computed at each time-slice x 0 and fed into the definitions of the estimators R AP , R m and R Z ; cf. eqs. (2.11a)-(2.11c). These, in theory, should also display plateaux as functions of x 0 , being functions of the current quark masses. However, as seen in figure 9, this is not usually the case. As anticipated in subsect. 2.2, the problem arises from numerical instabilities owing to the subtlety in the cancellation of nearly equal masses (such as 2m 12 and m 11 + m 22 in R AP ). We conclude that a change of strategy is required, in order to obtain stable results when approaching unitarity. In figure 10, the continuous band for R m , based on the polynomial fits to the PCAC masses, is shown for comparison. Results from both determinations agree for large valence quark masses. Close to the unitary point, where the older method fails, the polynomial fits give a safe R m -estimate. A similar behaviour is seen for the other estimators.

D Results at m 11 = 0
Our results for current quark masses, R AP , R m and R Z are listed in tables (5)-(8) for each configuration ensemble.

E Correlations between the observables
Since our final observables b A − b P , b m and Z are determined on the same ensembles, we expect them to be correlated. In table 9 and figure 11, we give estimators for these correlations, i.e.,