Lattice QCD at the physical point: Simulation and analysis details

We give details of our precise determination of the light quark masses m_{ud}=(m_u+m_d)/2 and m_s in 2+1 flavor QCD, with simulated pion masses down to 120 MeV, at five lattice spacings, and in large volumes. The details concern the action and algorithm employed, the HMC force with HEX smeared clover fermions, the choice of the scale setting procedure and of the input masses. After an overview of the simulation parameters, extensive checks of algorithmic stability, autocorrelation and (practical) ergodicity are reported. To corroborate the good scaling properties of our action, explicit tests of the scaling of hadron masses in N_f=3 QCD are carried out. Details of how we control finite volume effects through dedicated finite volume scaling runs are reported. To check consistency with SU(2) Chiral Perturbation Theory the behavior of M_\pi^2/m_{ud} and F_\pi as a function of m_{ud} is investigated. Details of how we use the RI/MOM procedure with a separate continuum limit of the running of the scalar density R_S(\mu,\mu') are given. This procedure is shown to reproduce the known value of r_0m_s in quenched QCD. Input from dispersion theory is used to split our value of m_{ud} into separate values of m_u and m_d. Finally, our procedure to quantify both systematic and statistical uncertainties is discussed.


Introduction
The goal of this paper is to give technical details of our calculation of the average light quark mass m ud = (m u +m d )/2 and of the strange quark mass m s at the physical mass point and in the continuum [1]. This calculation is from first principles and sets new standards in terms of controlling all systematic aspects of a direct calculation of quark masses. Because the values m u and m d are also of fundamental importance, we determine them by combining our results for m ud and m s with dispersive information based on η → 3π decays. A summary of recent determinations of light quark masses in N f = 2 and N f = 2+1 QCD is found in [1].
Let us begin by stating the minimum requirements for a first-principles determination of m ud and m s with fully controlled systematics: 1. The action should belong to the universality class of QCD according to standard arguments, based on locality and unitarity, and an exact algorithm should be used. 2. The light quark masses should be sufficiently close to their physical values such that an extrapolation, if necessary, can be performed without adding non-trivial assumptions. Our simulations are performed "at the physical mass point", i.e. with values of M π and M K which bracket the physical values; this eliminates the need for a "chiral extrapolation". 3. Simulations should be performed in volumes large enough to ensure that finite-volume effects are well under control (we use box sizes up to L ≃ 6 fm). 4. Simulations should be performed at no less than three lattice spacings a to make sure that a controlled extrapolation of all results to the continuum, a → 0, can be performed. 5. All renormalizations should be performed nonperturbatively, and the final result should be given in a scheme which is well-defined beyond perturbation theory (we will give our results in the RI/MOM scheme). 6. The scale and other input masses should be set by quantities whose relation to experiment is direct and transparent (we use the masses of the Ω baryon, the pion and the kaon).
The present work contains additional innovative features which are not required to give an ab-initio result, but help to keep all systematic errors small: 7. We devise a method, tailored to the needs of studies with Wilson-like fermions, to reconstruct the renormalized quark masses m ud and m s from the much simpler renormalization of the quantities m s /m ud and m s −m ud . We call it the "ratio-difference method". 8. We propose an approach which overcomes the RI/MOM "window-problem". It is based on taking a separate continuum limit of the evolution function R S (µ, µ ′ ) of the scalar density S from a scale µ ∼ 2 GeV, where the RI/MOM procedure yields reliable results, to a scale µ ′ ∼ 4 GeV where one may make (controlled) contact with perturbation theory. 9. We use an advanced analysis procedure to assess the size of both the statistical and the systematic uncertainties (the same one as in [2]).

Figure 1: Summary of our simulation points. The pion masses and the spatial sizes of the lattices are shown for our five lattice spacings. The percentage labels indicate regions in which the expected finite volume effect [3] on M π is larger than 1%, 0.3% and 0.1%, respectively. In our runs this effect is smaller than about 0.5%, but we still correct for this tiny effect.
In our view, item 2 marks the beginning of a new era in numerical lattice QCD, because it avoids an extrapolation in quark masses which, generically, requires strong assumptions, thus relinquishing the first-principles approach (see the discussion in [1]).
To give the reader an overview of where we are in terms of simulated pion masses M π and spatial box sizes L, a graphical survey of (some of) our simulation points is provided in Fig. 1 (with more details given in Sec. 5). We have data at 5 lattice spacings in the range 0.054 − 0.116 fm, with pion masses down to ∼ 120 MeV and box sizes up to ∼ 6 fm. Comparison with Chiral Perturbation Theory suggests that our finite volume effects are typically below 0.5%, and close to the physical mass point (which is the most relevant part) even smaller. Still, we correct for them by means of Chiral Perturbation Theory [3], and test the correctness of this prediction through explicit finite volume scaling runs (see below).
The remainder of this paper is organized as follows. In Sec. 2 details are given concerning the action and algorithm employed, while Sec. 3 specifies how one determines the HMC force with HEX smeared clover fermions. Our choice of the scale setting procedure and of the input masses is discussed in Sec. 4, with simulation parameters tabulated in Sec. 5. Checks of algorithmic stability are summarized in Sec. 6, while autocorrelation and (practical) ergodicity issues are reported in Sec. 7. To corroborate the good scaling properties of our action, explicit tests of the scaling of hadron masses in N f = 3 QCD are carried out, see Sec. 8. Details of how we control finite volume effects through dedicated finite volume scaling runs are reported in Sec. 9. To test consistency with SU(2) Chiral Perturbation Theory the behavior of M 2 π /m ud and F π as a function of m ud is investigated in Sec. 10. Details of how we use the RI/MOM procedure with a separate continuum limit of the running of the scalar density R S (µ, µ ′ ) are given in Sec. 11. To show the reliability of this procedure the known value of r 0 m s in quenched QCD is reproduced, see Sec. 12. In Sec. 13 it is discussed how one may use input from dispersion theory to split our value of m ud into separate values of m u and m d . Sec. 14 specifies our procedure to quantify both systematic and statistical uncertainties. Our final result is summarized in Sec. 15.
As we use the hybrid Monte Carlo (HMC) algorithm [10], a non-trivial ingredient with this action is the coding of the molecular dynamics (MD) force, which will be discussed in the next section. The MD updates are performed in quadruple precision, to ensure exact reversibility in our target (double precision) accuracy. Further particulars of our implementation (even-odd preconditioning [11], multiple time-scale integration, the "Sexton-Weingarten scheme" [12], mass preconditioning, the "Hasenbusch trick" [13], the Omelyan integrator [14], RHMC acceleration with multiple pseudofermions [15] and a mixed-precision solver [9]) have been described in [9]. As has been noted in the literature [16,17], combining several of these ingredients yields a dramatic reduction of the critical slowing down that has traditionally been observed for light quark masses. As we show in this paper, the thorough combination of all these ingredients allows for simulations directly at the physical mass point, in large volumes, with several lattice spacings, such that a controlled extrapolation to the continuum can be performed.
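The interplay of these ingredients is easiest to see in the integrator itself. Below is a minimal, self-contained sketch (not the production code) of a second-order minimum-norm ("Omelyan") MD trajectory for a toy one-dimensional Hamiltonian; the value of LAMBDA and the harmonic force are illustrative assumptions:

```python
# Omelyan second-order "minimum norm" integrator parameter (assumed value)
LAMBDA = 0.1931833275037836

def force(q):
    # toy harmonic action S(q) = q^2/2, so the MD force is -dS/dq = -q
    return -q

def omelyan_step(q, p, eps):
    # one 2nd-order minimum-norm step: P(lam) Q(1/2) P(1-2lam) Q(1/2) P(lam)
    p += LAMBDA * eps * force(q)
    q += 0.5 * eps * p
    p += (1.0 - 2.0 * LAMBDA) * eps * force(q)
    q += 0.5 * eps * p
    p += LAMBDA * eps * force(q)
    return q, p

def trajectory(q, p, eps, nsteps):
    for _ in range(nsteps):
        q, p = omelyan_step(q, p, eps)
    return q, p

def energy(q, p):
    # toy MD Hamiltonian: kinetic term plus action
    return 0.5 * p * p + 0.5 * q * q

q0, p0 = 1.0, 0.3
qf, pf = trajectory(q0, p0, 0.05, 200)
dE = abs(energy(qf, pf) - energy(q0, p0))   # MD energy violation
qb, pb = trajectory(qf, -pf, 0.05, 200)     # momentum-flipped reverse run
```

The two properties monitored in production runs appear here in miniature: the energy violation dE stays small at finite step size, and rerunning the trajectory with flipped momentum returns to the starting point, i.e. the integrator is reversible up to rounding.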

HMC force with smeared links
We use two steps of HEX smearing [8] in our fermion action S f , both for the Wilson and for the clover term. Our S f depends on the thin (unsmeared) links only through the smeared links V (2) , where V (n) denotes the HEX smeared links, with n indicating the smearing level. Generically, the fermionic contribution to the HMC force is given by the gauge derivative δS f /δU. In order to obtain the derivative of S f with respect to U for our two-step smearing recipe, we will apply the chain rule twice, which leads us to the following scheme. First one calculates δS f /δV (2) , which encodes the details of how the fermions are coupled to the smeared gauge fields. This part of the calculation is not related to the smearing: one just takes δS f [U]/δU and replaces U → V (2) . The main consequence of the nested dependence (6) is a recursion formula relating δS f /δV (n) to δS f /δV (n−1) , where the proper definition of the star-product and of the second term will be given below. Thinking in terms of routines of the computer code, one such step takes the previous derivative δS f /δV (n) and the links V (n) , V (n−1) as input and yields the next derivative δS f /δV (n−1) . This procedure needs to be called n times; at the end we obtain the final fermion force δS f /δU. Since the extension to a second and possibly more steps is straightforward, we will only consider the derivation for one level of HEX smearing in the following. We now specify the two main ingredients in the derivation of the HMC force for fat-link actions, that is the gauge derivative and the pertinent chain rule. Since an SU(3) matrix in the fundamental representation is a complex 3 × 3 matrix with only 8 independent degrees of freedom, it is a structured matrix and the derivative has to be defined properly.
The Lie algebra. For U ∈ SU(3) an infinitesimal change can be written with real parameters u A and the set {T A |A = 1 . . . 8} forming a basis in the space of the traceless, anti-hermitian matrices, i.e. of the tangent space of the group. These matrices are normalized through Tr(T A T B ) = −δ AB . Using the trace, one can define a scalar product on this vector space for X = Σ A x A T A and Y = Σ A y A T A in the Lie algebra; the scalar product allows one to build a projector P TA which restricts an arbitrary 3×3 matrix, M, to the tangent space (4). Furthermore, it is easy to show that for arbitrary matrices M and N: Re Tr(P TA (M)N) = Re Tr(P TA (N)M) .
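Both the projector and the symmetry property Re Tr(P TA (M)N) = Re Tr(P TA (N)M) can be checked numerically. The following sketch (pure stdlib, with random 3×3 matrices as an assumed test input) implements P TA as the traceless anti-hermitian part of a matrix:

```python
import random

N = 3  # SU(3)

def dagger(m):
    return [[m[j][i].conjugate() for j in range(N)] for i in range(N)]

def tr(m):
    return sum(m[i][i] for i in range(N))

def mul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(N)) for j in range(N)]
            for i in range(N)]

def proj_TA(m):
    # project an arbitrary complex NxN matrix onto the traceless
    # anti-hermitian matrices (the su(3) tangent space)
    md = dagger(m)
    ah = [[0.5 * (m[i][j] - md[i][j]) for j in range(N)] for i in range(N)]
    t = tr(ah) / N
    for i in range(N):
        ah[i][i] -= t
    return ah

def rand_mat(rng):
    return [[complex(rng.uniform(-1, 1), rng.uniform(-1, 1))
             for _ in range(N)] for _ in range(N)]

rng = random.Random(1)
M, Nm = rand_mat(rng), rand_mat(rng)
PM, PN = proj_TA(M), proj_TA(Nm)
lhs = tr(mul(PM, Nm)).real   # Re Tr(P_TA(M) N)
rhs = tr(mul(PN, M)).real    # Re Tr(P_TA(N) M)
```

The output PM is traceless and anti-hermitian by construction, and lhs equals rhs to machine precision, which is the identity used below to shuffle the projector from one factor to the other.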
Complex valued functions. Let f be a complex valued function on the group SU(3). Its derivative with respect to the group element is a vector in the Lie algebra whose components are defined as the variations of f along the basis directions T A . Throughout, the partial derivative with respect to U under the trace is to be understood as a derivative with respect to unconstrained matrix elements. If f depends on the adjoint matrix U † , then using the identity U † = U −1 this dependence is converted into a dependence on U. Real valued functions. For a real valued function f the group derivative takes an analogous form. The smeared link V = V (U) is a mapping on the group space. If U changes as in (10), with small parameters u A = O(ǫ), then V will also undergo a small change, which may be written in the same form with small parameters v B = O(ǫ) that are real-valued functions of the original parameters u A . Below we shall need the u A derivative of V ′ which, upon taking the limit u A → 0, yields the Jacobian ∂v B /∂u A . Chain rule. Let the function f depend on U only via V , and let us calculate its derivative with respect to U. Again, U transforms with infinitesimal parameters u A , resulting in an infinitesimal change of the variables v B = v B (u) of V . The usual chain rule expresses the u A derivative of f through the v B derivatives and the Jacobian ∂v B /∂u A ; after taking the limit u A → 0, and with (11) and the definition of the gauge derivative from (13,20), this may be rewritten as the chain rule for the gauge derivative. With these ingredients we can now specify the HMC force for a fermion action with one step of HEX smearing. In the following we will simplify our notation by replacing V (i,n) with V (i) . One HEX smearing, U µ → V (3) µ , is built from three substeps [cf. Eq. (3)], where P TA has been defined in (4) and C (1) , C (2) , C (3) represent staples, constructed with the factors α i /(2(d−1)) included (for reasons to become obvious below).
In the following we will drop Lorentz indices and space-time arguments for simplicity. The task is to calculate the HMC force [with V = V (3) ], where δS f /δV is already known. The A component of the star product [see (23)] involves Σ (3) , which contains the part of the force which is already known. Since S f is a real-valued function we can write this in a compact form. To improve readability, let us temporarily denote V (3) by V . The last substep is then V = exp(A)U, where A = P TA (CU † ). The derivative of the exponential of a traceless anti-hermitian A reads [see Eq. (68) of [7]] as a combination of B 1,2 , second-order polynomials of A, and f 1,2 , complex constants which depend on the trace and determinant of A. Using the cyclicity of the trace in the color indices, we arrive at (33), where the second term contains the derivative of A = P TA (CU † ). We use the identity (12), valid for arbitrary M and N, to shuffle the projector in A = P TA (CU † ) onto the matrix in square brackets in (33). Next, we use that the derivative of CU † can be written as a staple derivative, and introduce a convenient notation for the expression in square brackets in (33). With this at hand, and using the relation exp(A) = V U † , we arrive at the compact expression (37). Finally, we reintroduce the superscript (3) and note that the U dependence of C (3) comes exclusively from V (2) . With these adjustments, relation (37) takes a form involving Σ (2) , meaning that the term Σ (2) can be calculated in a similar way as Σ (3) . This procedure can be iterated, and we find the HMC force for an action with one step of HEX smearing, in which Σ (i) is defined recursively. The object ∂C (i+1) /∂V (i) is a straightforward staple derivative, where some care needs to be taken w.r.t. the Lorentz indices. With this formula, one may implement a routine which calculates the HMC force for a fermion action with one step of HEX smearing.
The extension to a second smearing step is realized through a second call to this routine, as shown in (6). This calculation of the HMC force with HEX smeared fermion actions extends the results of [7,18]. An early treatment of the HMC force for fat-link fermion actions is found in [19].

Scale setting and input masses
To set the scale and to adjust the light and strange quark masses m ud = (m u + m d )/2 and m s to their physical values, we need to identify three quantities which can be precisely computed on the lattice and measured in experiment. We will use the mass of the Ω baryon and of the pseudoscalar mesons π, K for this purpose, in the latter case with a small correction for isospin-breaking and electromagnetic effects (see below).
In other words, at the point where M π /M Ω and M K /M Ω agree with the physical values of these ratios, the measured value of aM Ω is identified with the lattice spacing times 1.672 GeV [20], and this yields a −1 . In [2] we also used the Ξ baryon to set the scale. As discussed there, correlation functions of spin 3/2 baryons are somewhat noisier than those of spin 1/2 baryons. On the other hand, the more light valence quarks present in a baryon, the larger the fluctuations. In [2], with M π down to 190 MeV, these two effects balanced each other, rendering the Ω and the Ξ equally good choices. In the present paper we go down to M π = 120 MeV, and the Ω, with no light valence quark, is the better choice. We use the standard mass-independent scale-setting scheme, in which this lattice spacing is subsequently attributed to all ensembles with the same coupling β = 6/g 2 0 and N f . Our ensembles bracket the physical quark masses m ud and m s in the sense that the spans of simulated M 2 π and M 2 ss ≡ 2M 2 K −M 2 π encompass the physical values given below. As a result, it suffices to use a parametrization of aM Ω as a function of (aM π ) 2 and (aM ss ) 2 which describes the entire data set. We find that the Taylor ansatz aM Ω = c 0 +c 1 (aM π ) 2 +c 2 (aM ss ) 2 +c 3 (aM ss ) 4 works perfectly.
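A numerical sketch of this scale-setting step: fit the Taylor ansatz to synthetic ((aM π )², (aM ss )², aM Ω ) triples by least squares, then convert aM Ω at the physical point into a −1 via M Ω = 1.672 GeV. All numbers below are invented for illustration; only the functional form follows the text:

```python
import random

def fit_taylor(data):
    # least-squares fit of aM_Omega = c0 + c1*x + c2*y + c3*y^2,
    # with x = (a M_pi)^2 and y = (a M_ss)^2, via normal equations
    rows = [[1.0, x, y, y * y] for x, y, _ in data]
    b = [m for _, _, m in data]
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(4)] for i in range(4)]
    atb = [sum(r[i] * v for r, v in zip(rows, b)) for i in range(4)]
    # Gaussian elimination with partial pivoting
    for k in range(4):
        piv = max(range(k, 4), key=lambda r: abs(ata[r][k]))
        ata[k], ata[piv] = ata[piv], ata[k]
        atb[k], atb[piv] = atb[piv], atb[k]
        for r in range(k + 1, 4):
            f = ata[r][k] / ata[k][k]
            for j in range(k, 4):
                ata[r][j] -= f * ata[k][j]
            atb[r] -= f * atb[k]
    c = [0.0] * 4
    for k in range(3, -1, -1):
        c[k] = (atb[k] - sum(ata[k][j] * c[j] for j in range(k + 1, 4))) / ata[k][k]
    return c

# synthetic "ensembles": (aM_pi)^2, (aM_ss)^2, aM_Omega (illustrative only)
true_c = [0.98, 0.45, 0.80, -0.30]
rng = random.Random(7)
pts = []
for _ in range(12):
    x, y = rng.uniform(0.003, 0.06), rng.uniform(0.05, 0.14)
    pts.append((x, y, true_c[0] + true_c[1] * x + true_c[2] * y + true_c[3] * y * y))
c = fit_taylor(pts)

# scale setting: a^{-1} = 1.672 GeV / (aM_Omega at the physical point);
# the physical-point values of x and y below are placeholders
aM_omega_phys = c[0] + c[1] * 0.004 + c[2] * 0.09 + c[3] * 0.09 ** 2
a_inv_GeV = 1.672 / aM_omega_phys
```

With noise-free synthetic data the fit recovers the generating coefficients exactly, which is a useful sanity check on the solver before feeding it real ensembles.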
Since our lattice simulations are performed in the isospin symmetric limit m d = m u and do not account for electromagnetic interactions, the physical input meson masses must be corrected for these effects. The account of this given by FLAG [21] is essentially a refined version of the analysis presented by MILC [22] some time ago. The bottom line is that in the framework mentioned above one should use the input masses M π = 134.8(3) MeV, M K = 494.2(5) MeV, which means that the electromagnetically corrected, isospin-averaged pseudoscalar input meson masses essentially agree with the PDG values of M π 0 and [(M 2 K + +M 2 K 0 )/2] 1/2 , respectively.

Simulation parameters
An overview of our N f = 2 + 1 QCD simulations is presented in Tab. 1. For each ensemble we indicate the bare parameters, the lattice geometry, and the ensemble length in τ = 1 units (after thermalization). In addition, the pion mass for the given parameters (determined with a specific choice of the fitting interval) is given. Note that the quoted error is only statistical; a detailed account of our procedure to keep track of the systematic uncertainties is given in Sec. 14. With Wilson fermions negative bare masses are not disturbing; after renormalization they will evaluate to positive quark masses (see Sec. 11). We work with spatial volumes as large as L 3 ≃ (6 fm) 3 and temporal extents up to T ≃ 8 fm. Besides reducing finite-volume corrections and excited-state contaminations, large (four-dimensional) volumes tend to decrease statistical uncertainties to the same extent as increasing the number of trajectories (in a fixed volume) would do. For instance, in an L 4 box 1300 trajectories at M π L = 4 are approximately equivalent to 4100 trajectories at M π L = 3. With an HMC-type algorithm, the effort (at fixed pion mass) grows like L 5 . Nevertheless, in view of the increased algorithmic stability (see below), choosing large four-dimensional volumes is a beneficial strategy. The integrated autocorrelation times of the smeared plaquette and of the number of conjugate gradient iterations in the HMC accept/reject step are at most O(10) trajectories. Depending on this autocorrelation time, the gauge field after every fifth or every tenth trajectory is stored as a configuration to be used for calculating hadronic observables.
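The quoted trajectory-count equivalence follows from simple scaling: assuming the squared statistical error falls like 1/(N traj × V) with V ∝ L 4 at fixed lattice spacing and pion mass, matching the statistics of a run at M π L = 4 with one at M π L = 3 costs a factor (4/3) 4 :

```python
# Statistical power of an ensemble scales like N_traj * V with V ~ L^4
# (fixed lattice spacing and pion mass), so equal power at M_pi*L = 3
# requires (4/3)^4 times as many trajectories as at M_pi*L = 4.
n_large = 1300                        # trajectories at M_pi * L = 4
n_small = n_large * (4.0 / 3.0) ** 4  # equivalent count at M_pi * L = 3
```

Rounding n_small to the nearest hundred reproduces the 4100 trajectories quoted in the text.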
We put sources for the correlation functions on up to eight timeslices. For the precise form of the meson and baryon interpolating operators see e.g. [23]. To reduce the relative weight of excited states in correlation functions Gaussian sources and sinks are used, with a radius of about 0.32 fm, which was found to be a good choice [2].

Algorithm stability
To detect potential instabilities of the HMC algorithm, different stability tests need to be performed. A rather complete battery of such tests was described in [9]. The pion masses used in this work are considerably smaller than those encountered in [2,9]. For this reason, we repeat the relevant stability tests for the smallest-mass ensemble at each β, in particular for those with the physical pion mass.
With D the Wilson or clover operator, the spectrum of A = D † D has no strict lower bound, i.e. the operator A is positive semi-definite, but not positive definite. If one could integrate the HMC trajectories exactly, this would not cause any problem, since an eigenvalue λ of A approaching zero would introduce an unbounded back-driving force in the HMC, so that the exact zero would be avoided. In practice, the trajectories are generated with a finite step-size integrator. Therefore, a very small λ min in the MD evolution may experience a smaller back-driving force than it would in an exact evolution scheme, and this may trigger an instability.
If a particularly small eigenvalue appears during the molecular dynamics (MD) evolution, the solver in the MD force calculation will require more iterations to arrive at its target precision. More precisely, the inverse of the iteration count N CG is closely related to the smallest eigenvalue of A. In a given ensemble, N −1 CG shows an approximately Gaussian distribution [9]. As long as its median is away from zero by several standard deviations, the simulation is deemed safe [9,24]. In Fig. 2 we show the "worst case scenario", i.e. the situation for the smallest quark masses in our set of ensembles. As one can see, even for pion masses as small as 120−135 MeV the inverse iteration count and hence the spectrum is bounded away from zero.
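A toy version of this stability criterion, with synthetic iteration counts standing in for real solver data (the Gaussian shape and the numbers are assumptions for illustration):

```python
import random
import statistics

# The inverse CG iteration count 1/N_CG tracks the smallest eigenvalue of
# D^dag D.  A run is deemed "safe" when the median of its (approximately
# Gaussian) distribution sits several standard deviations above zero.
rng = random.Random(5)
inv_ncg = [1.0 / max(1, int(rng.gauss(1500.0, 120.0))) for _ in range(4000)]
med = statistics.median(inv_ncg)
sig = statistics.pstdev(inv_ncg)
n_sigma = med / sig   # "distance" of the distribution from zero, in sigmas
```

For these synthetic counts the median sits more than ten standard deviations above zero, the kind of margin the text describes as safe.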
Alternatively, one can monitor the magnitude of the various contributions to the MD force in the MD evolution. This is done in Fig. 3, where the maximum (over space-time) of each individual contribution to the total MD force during the MD integration is shown over a period of 10,000 MD steps. As usual, the maximum of each contribution to the MD force fluctuates. However, it is important that the magnitude of these fluctuations is not too large. The more frequently spikes with large magnitude occur at any given MD step-size, the lower the HMC acceptance rate becomes, to the point where the algorithm turns unstable. For small pion masses and coarse lattice spacings, the situation becomes even worse. This is why we show the ensemble with our smallest pion mass (around the physical pion mass) at the coarsest lattice spacing in Fig. 3. As one can see, there are no dangerously large fluctuations present. Finally, it is good practice to monitor the violation ∆E MD of the MD energy conservation. In Fig. 4 we show again the "worst case scenario", that is the ensemble with our smallest pion mass at the coarsest lattice spacing. As one can see, the typical ∆E MD in this simulation is small. For most of our simulations the acceptance rate is above 90%. Since the acceptance probability is given by p acc = min(1, exp(−∆E MD )), it is reasonable to use the data accumulated in the monitoring of the MD energy violation to check that ⟨exp(−∆E MD )⟩ = 1 within errors.
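The identity ⟨exp(−∆E MD )⟩ = 1 can be illustrated with synthetic data: a Gaussian ∆E with mean σ²/2 satisfies it exactly, and the same samples yield the corresponding acceptance rate. The numbers below are assumptions, not measured ∆E values:

```python
import math
import random

# Toy check of <exp(-dE)> = 1 (Creutz's identity) and of the acceptance
# rate p_acc = min(1, exp(-dE)) for Gaussian dE with mean sigma^2/2.
rng = random.Random(0)
sigma = 0.3
mu = 0.5 * sigma * sigma
samples = [rng.gauss(mu, sigma) for _ in range(200000)]
mean_exp = sum(math.exp(-d) for d in samples) / len(samples)
acc_rate = sum(min(1.0, math.exp(-d)) for d in samples) / len(samples)
```

With this spread of ∆E the acceptance comes out close to 90%, in the ballpark quoted in the text, while mean_exp reproduces unity within the statistical error of the sample.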
In summary, because (a) our algorithm is free of dangerous fluctuations of the clover eigen-

Autocorrelation and ergodicity checks
A known source of concern about HMC simulations in the regime of light quark masses and/or small lattice spacings is whether the Markov chain manages to sample configuration space sufficiently well, i.e. whether the algorithm is (in practical terms) ergodic [25][26][27]. We monitor two cheap gluonic quantities which are supposed to signal suspicious behavior, if there is any. The first one is the plaquette and/or Symanzik gauge action. With the plaquette it makes sense to consider smeared varieties, too, i.e. Re Tr V plaq where V is a smeared gauge link, as described in Sec. 2. We find integrated autocorrelation times of such quantities to be at most O(10) in units of unit-length (τ = 1) trajectories. The second quantity is the bare topological charge, which is constructed from links which have undergone 10 or 30 steps of HYP smearing [6]. The result for our finest lattice spacing which, according to [25][26][27], represents the worst-case scenario, is shown in Fig. 5. First of all, the 10 HYP and 30 HYP charges are always close to each other. Second, binning them with bin boundaries at Z/2 + 1/4 yields a clear abundance of integer-centered bins over half-integer-centered bins, and this effect is more pronounced with the more vigorous smearing recipe. Last but not least, the histogram of either charge is reasonably symmetric after about 5000 trajectories, and the integrated autocorrelation time is O(10) trajectories. Since this is the highest β simulated, we see no reason for concern about the (practical) ergodicity of our simulations.
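The binning prescription with boundaries at Z/2 + 1/4 can be sketched as follows; the charges here are synthetic (integers plus small noise, an assumption mimicking a well-smeared topological charge), merely to illustrate how integer-centered bins come to dominate:

```python
import random

def bin_center(q):
    # bin boundaries at Z/2 + 1/4 mean bins centred on integers and
    # half-integers; round to the nearest multiple of 1/2
    return round(2.0 * q) / 2.0

# synthetic charges: integer topological sectors plus small smearing noise
rng = random.Random(3)
charges = [rng.randint(-3, 3) + rng.gauss(0.0, 0.1) for _ in range(5000)]
centers = [bin_center(q) for q in charges]
n_int = sum(1 for c in centers if c == round(c))   # integer-centered bins
n_half = len(centers) - n_int                      # half-integer-centered bins
```

For charges that cluster near integers, the integer-centered bins dominate by a large factor, which is the signature looked for in Fig. 5.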
One should keep in mind that the topological tunneling rate may depend sensitively on the details of the action (e.g. whether Wilson, Symanzik or Iwasaki glue is used, whether smeared or unsmeared links are used in the fermionic part) and on the algorithm (e.g. the number of time scales and the specific choice of Hasenbusch masses).

N f = 3 scaling test for hadron masses
Since the link smearing of the 2HEX action used in the present work differs from the smearing used in [2,9], we decided to repeat the entire scaling test, as presented in [9], in all its detail for the new action.
To this end, we run a number of N f = 3 simulations at various lattice spacings and various M π /M ρ ratios. For each β we interpolate the (common) octet and decuplet baryon masses to a fixed M π /M ρ ratio. We observe good scaling over the full range of lattice spacings considered (up to about 0.15 fm), with a slight preference for O(a 2 ) over O(αa) scaling, and this suggests that our tree-level value of c SW (see Sec. 2 for the definition and details) is close to the nonperturbative value (which is not known for our action). This finding is in accordance with the results of [8]. Next, the continuum extrapolated values shown in Fig. 6 are in perfect agreement with the continuum extrapolated baryon masses found in [9] with a different action. Last but not least, the slope in either panel of Fig. 6 is small, and an action which generically shows a flat slope in scaling quantities is useful for obtaining precise predictions in the continuum.
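The continuum-extrapolation step behind such a scaling test can be sketched generically: fit a scaling quantity linearly in a² and read off the intercept. The spacings below match the range quoted in the text; the masses are invented for illustration:

```python
# Generic continuum extrapolation: fit M(a) = M_cont + c * a^2 (the O(a^2)
# ansatz favoured by the data) and read off the intercept.
def linfit(xs, ys):
    n = len(xs)
    sx, sy = sum(xs), sum(ys)
    sxx = sum(x * x for x in xs)
    sxy = sum(x * y for x, y in zip(xs, ys))
    slope = (n * sxy - sx * sy) / (n * sxx - sx * sx)
    return (sy - slope * sx) / n, slope     # intercept = continuum value

a_fm = [0.054, 0.065, 0.077, 0.093, 0.116]   # five lattice spacings
m_true, c_true = 1.50, 2.0                   # hypothetical M_cont and slope
masses = [m_true + c_true * a * a for a in a_fm]
m_cont, slope = linfit([a * a for a in a_fm], masses)
```

A flat slope in a², as observed in Fig. 6, means the intercept m_cont carries little extrapolation uncertainty.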
In summary we find that both the 6stout action used in [2,9] and the 2HEX action used in the present work exhibit small cut-off effects on standard hadron masses over a broad range of lattice spacings.

Finite volume corrections
For a fixed set of bare parameters, β, m ud , m s , energies and matrix elements of hadronic states depend on the spatial size L of the lattice. Typically, the finite volume tends to increase the effective mass and to decrease the decay constant, relative to their infinite volume counterparts. As a first step it is important to assess by how much such effects would affect the data. For the theoretical treatment of these finite-volume effects it makes a difference whether the state considered is stable under strong interactions (despite the fact that, in a finite volume, the energy spectrum is discrete and all states are stable). The respective frameworks have been established by Lüscher, both for stable [28,29] and unstable [30,31] states. They allow us to quantify and eventually correct for finite-volume effects in a self-consistent manner (i.e. in a way in which only the results of our calculations and axioms of quantum field theory are used).
The structure of these corrections is most transparent for the case of a pion at 1-loop order in Chiral Perturbation Theory (ChPT) [32][33][34]. Up to higher order terms, the relative shift is R π (L) ≡ [M π (L) − M π ]/M π = coeff ·g̃ 1 (M π L) (42), where the shape function g̃ 1 (x) has a well behaved expansion in terms of a modified Bessel function of the second kind, which itself has a large-x expansion of the form g̃ 1 (x) ∝ x −3/2 e −x [1 + O(1/x)], implying that finite volume corrections are exponentially suppressed at large L [28]. Higher loop orders for R π (L) have been worked out in [3]. For completeness we mention that analytic results for finite volume corrections of the nucleon are given in [35,36]. The second reference predicts that for physical quark masses and L = 5 fm box size (which is what we use in our smallest box at the physical mass point, the one at β = 3.61, cf. Tab. 1) the nucleon experiences just a 2 permil finite-size shift. The point is that ChPT predicts the numerical value of the coefficient "coeff" in (42). In the chiral literature, the low-energy constants that enter "coeff" are pinned down from experiment (at leading order it is F π ).
To avoid using external input, we content ourselves with using just the functional form of (42). This is permissible, since the shape function g̃ 1 (x) is just the free Green's function of a massive scalar particle, summed over all spatial mirror copies (due to periodic boundary conditions in the spatial directions) [32], see also the discussion in [3]. We find that we can establish a global fit to all of our data in various volumes if we adjust the free coefficient in (42). A similar conclusion is reached for other stable hadrons, albeit with a different numerical value of the constant. For the case of the pion, we test the fitting ansatz and the analytic prediction [3] by comparing them to dedicated finite-volume scaling runs, as shown in Fig. 7. Both the fit (full line) and the prediction from ChPT [3] (dashed curve) agree with the data. The latter prediction has a limited range, since ChPT becomes questionable in boxes with M π L < 3 [3]. It is important to emphasize that the data with M π L < 4 in Fig. 7 have been generated to test our treatment of finite-volume effects; they do not enter the final analysis. These results confirm our rule of thumb that simulations with M π L ≥ 4 and/or L ≳ 5 fm yield infinite-volume masses within statistical accuracy. An overview of the expected size of R Mπ in our simulations is given in Fig. 1. In all of these points the mass correction is less than about 5 permil, and for points close to M phys π (which dominate our analysis) it is even smaller. Nevertheless, we include these (tiny) shifts in our global analysis (cf. Sec. 14).
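The mirror-copy structure of the shape function can be sketched numerically. The representation used below, g̃ 1 (λ) = Σ n 4 m(n)/(λ√n) K 1 (λ√n) with multiplicities m(n) of integer 3-vectors, is an assumed standard form (the overall normalization is not taken from this text); the point of the sketch is the exponential suppression with M π L:

```python
import math

def bessel_k1(x):
    # K_1(x) = int_0^inf exp(-x cosh t) cosh t dt, simple trapezoid rule
    # (adequate for x >~ 1, which is all we need here)
    n, tmax = 4000, 12.0
    h = tmax / n
    s = 0.5 * (math.exp(-x) + math.exp(-x * math.cosh(tmax)) * math.cosh(tmax))
    for i in range(1, n):
        t = i * h
        s += math.exp(-x * math.cosh(t)) * math.cosh(t)
    return h * s

def multiplicity(n, rmax=6):
    # number of integer 3-vectors with squared length n
    return sum(1 for i in range(-rmax, rmax + 1)
                 for j in range(-rmax, rmax + 1)
                 for k in range(-rmax, rmax + 1)
                 if i * i + j * j + k * k == n)

def g1_tilde(lam, nmax=20):
    # assumed form: free massive propagator summed over spatial mirror copies
    return sum(4.0 * multiplicity(n) / (lam * math.sqrt(n))
               * bessel_k1(lam * math.sqrt(n)) for n in range(1, nmax + 1))

shift3 = g1_tilde(3.0)   # relative FV effect at M_pi*L = 3, up to "coeff"
shift4 = g1_tilde(4.0)   # ... and at M_pi*L = 4
```

Going from M π L = 3 to M π L = 4 shrinks the shape function by several factors, which is why the rule of thumb M π L ≥ 4 keeps the residual correction at the permil level once the overall coefficient is fitted.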

Chiral behavior of pion mass and decay constant
To illustrate the quality of our results obtained in lattice QCD calculations with physical or larger than physical values of the quark mass m ud = (m u +m d )/2, we briefly investigate here whether the m ud dependence of the pion mass and decay constant can be described by ChPT [37,38] in this range of quark masses.
To this end we compare our results for M 2 π and F π versus m ud at fixed (nearly physical) m s (cf. Tab. 1) to the NLO predictions of the SU(2) framework. The latter read [37] M 2 π = M 2 [ 1 + (x/2) ln(M 2 /Λ 2 3 ) ] (45) and F π = F [ 1 − x ln(M 2 /Λ 2 4 ) ] (46), with x = M 2 /(4πF ) 2 and M 2 = 2Bm ud a shorthand expression for the light quark mass (up to the factor 2B, with B = Σ/F 2 ). The NNLO expressions can be found in [39]. In all of these expressions F, Σ, B refer to the pion decay constant, the absolute value of the quark condensate and the condensate parameter in the 2-flavor chiral limit m ud → 0 with m s held fixed. The terms in the square brackets proportional to x 0 , x 1 represent the tree-level and 1-loop contributions respectively, and Λ 3 , Λ 4 encode low-energy constants (LECs). In Fig. 8 the quantities M 2 π /m PCAC ud and F π are plotted versus the PCAC quark mass m PCAC ud (see below) at an intermediate lattice spacing (β = 3.5), where we reach down to M π ≃ 130 MeV (cf. Tab. 1). We find that our results with M π < 400 MeV can be jointly fitted with the NLO chiral formulae (45,46), with acceptable χ 2 /d.o.f. and reasonable values of the low-energy constants. However, the extraction of Gasser-Leutwyler coefficients is beyond the scope of this paper and will be left for future publications.
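The NLO SU(2) formulae can be evaluated directly. The sketch below uses illustrative values for F, B, Λ 3 , Λ 4 (they are assumptions, not fit results of this work):

```python
import math

# NLO SU(2) ChPT curves for M_pi^2 and F_pi versus the light quark mass.
# F, B, L3, L4 are illustrative placeholder values in GeV units.
F = 0.0868          # decay constant in the 2-flavour chiral limit
B = 2.7             # condensate parameter
L3, L4 = 0.6, 1.2   # Lambda_3, Lambda_4

def mpi2_fpi(m_ud):
    M2 = 2.0 * B * m_ud                    # leading-order mass squared, GeV^2
    x = M2 / (4.0 * math.pi * F) ** 2
    mpi2 = M2 * (1.0 + 0.5 * x * math.log(M2 / L3 ** 2))   # Eq. (45)
    fpi = F * (1.0 - x * math.log(M2 / L4 ** 2))           # Eq. (46)
    return mpi2, fpi

mpi2, fpi = mpi2_fpi(0.0035)   # m_ud ~ 3.5 MeV (illustrative)
```

With these placeholder inputs the curves land near M π ≈ 135 MeV and an F π a few percent above its chiral-limit value, i.e. in the qualitative regime probed by Fig. 8.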

RI/MOM renormalization of quark masses
Our primary goal is to determine the average up-down quark mass m ud = (m u + m d )/2 and the strange quark mass m s at the physical mass point in a "continuum" renormalization scheme, such as RI/MOM or RGI, using first-principles lattice computations.
In a lattice study quark masses and the running coupling have a different status than other observables, such as hadron masses and decay constants, since they are input parameters to the simulation. Consequently, one has to tune these parameters until the low-energy spectrum of the theory agrees with experiment (cf. [2] and Sec. 4), before one may read them off from the results of the simulation. To turn them into observables, one has to convert them from the original cut-off scheme (which is specific to the gluon and quark action combination used) to a scheme where the scale µ is not tied to the lattice spacing a.
The remainder of this section is organized as follows. In 11.1 our "ratio difference method" for extracting quark masses in the theory with Wilson fermions is explained, using standard terminology for the renormalization and improvement coefficients. It is important to notice that in the dynamical theory there is a subtlety in the renormalization pattern, due to quark line disconnected diagrams [40][41][42], but our "ratio difference method" steers around this complication, as explained in 11.2. In subsection 11.3 details of how we determine the flavor non-singlet scalar renormalization constant Z S (µ) via the Rome-Southampton RI/MOM method [43] are given. In 11.4 it is specified how we control the systematics that arise from the dedicated N f = 3 computations needed in the RI/MOM procedure. In 11.5 a summary is given.

Ratio-difference method in a nutshell
With Wilson-type fermions there are two options for obtaining the renormalized quark mass. On the one hand, one may start from the mass parameter am^bare, as present in the Lagrangian, and apply both additive and multiplicative renormalization to build the "VWI quark mass"

    m^VWI = (1/Z_S) m^W (1 + b_S a m^W),   m^W = m^bare - m^crit,   (47)

where VWI stands for "vector Ward identity". Here Z_S = Z_S(µ) denotes the lattice-to-continuum renormalization factor of the scalar density (it depends on the chosen scheme and scale, e.g. MS and µ = 2 GeV), b_S is an improvement coefficient (see below), and m^crit specifies the additive mass renormalization, i.e. the bare quark mass at which the pion becomes massless.
The other way to obtain the renormalized quark mass is as follows. For a pseudoscalar meson made out of valence quarks ψ_1, ψ_2 with Lagrangian masses m_i^bare or Wilson masses m_i^W (i = 1, 2), respectively, the sum of the (unrenormalized) PCAC quark masses is defined as

    m_1^PCAC + m_2^PCAC = ( Σ_x ∂_µ <A_µ^imp(x) O> ) / ( Σ_x <P(x) O> ),   A_µ^imp = A_µ + c_A a ∂_µ P,   (48)

where A_µ and P denote the axial current and the pseudoscalar density, respectively, and O represents an arbitrary operator which couples to the meson (usually O = P to maximize the signal). Then only a multiplicative renormalization is needed to form the (renormalized) "AWI quark mass"

    m^AWI = (Z_A/Z_P) m^PCAC [ 1 + (b_A - b_P) a m^W ],   (49)

where AWI stands for "axial Ward identity". Here Z_A and Z_P = Z_P(µ) denote the lattice-to-continuum renormalization factors of the axial current and the pseudoscalar density, respectively. The coefficients b_S, b_A, b_P, c_A in (47-49) are part of the improvement program. If properly set, O(a^2) scaling of phenomenological quantities can be achieved, but they may be set to zero if one is content with O(a) scaling. We use (47-49) with tree-level values of the improvement coefficients, that is b_S = b_A = b_P = 1 and c_A = 0. Formally, our results thus show cut-off effects proportional to αa, but in the scaling tests presented above cut-off effects proportional to a^2 seem to be numerically dominant. At this point we cannot anticipate whether a similar statement holds true for renormalized quark masses, and we shall thus consider both possibilities (i.e. leading cut-off effects proportional to αa or a^2). In any case the difference between the two definitions (in a given scheme, at a given µ) scales away as a → 0, hence m^AWI = m^VWI in the continuum.
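As a concrete illustration of the PCAC definition, the bare mass sum can be read off from a plateau in the ratio of the time derivative of the axial-pseudoscalar correlator to the pseudoscalar correlator. The sketch below is a minimal version with hypothetical inputs (ensemble-averaged, zero-momentum-projected correlators with O = P; array layout and function name are our own):

```python
import numpy as np

def pcac_mass_sum(c_a4p: np.ndarray, c_pp: np.ndarray,
                  t_min: int, t_max: int) -> float:
    """Plateau estimate of the bare PCAC mass sum m_1 + m_2 (lattice units)
    from ensemble-averaged correlators (illustrative inputs):
      c_a4p[t] ~ sum_x <A_4(x,t) O(0)>,  c_pp[t] ~ sum_x <P(x,t) O(0)>.
    The symmetric lattice time derivative approximates d_t <A_4 O>."""
    dt_a4p = 0.5 * (np.roll(c_a4p, -1) - np.roll(c_a4p, 1))
    m_eff = dt_a4p / c_pp                      # effective m_1 + m_2 at each t
    return float(np.mean(m_eff[t_min:t_max]))  # average over the plateau
```

In a real analysis the plateau window [t_min, t_max) would correspond to the fit ranges discussed in Sec. 14, and the error would come from a jackknife or bootstrap over configurations.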
In principle, m_s^phys and m_ud^phys may be determined using either definition of the quark mass, but in practice it proves beneficial to combine the specific advantages of the VWI and AWI approaches. Let us assume, for a moment, that we were to set all improvement coefficients to zero. Since the Lagrangian quark mass m^bare is an exact input quantity which, after a universal shift has been applied, renormalizes multiplicatively with the unproblematic scalar density [cf. (47)], it is natural to use m^W for quark mass differences, where the additive renormalization term m^crit drops out. On the other hand, the PCAC quark mass m^PCAC is perfectly suited to measure quark mass ratios, since in the ratio the multiplicative renormalization constants cancel [cf. (49)]. Defining the difference d = a(m_s^W - m_ud^W) and the ratio r = m_s^PCAC/m_ud^PCAC, the renormalized masses follow as

    a m_ud^scheme = (1/Z_S) d/(r - 1),   a m_s^scheme = (1/Z_S) r d/(r - 1),   (50)

and we shall refer to this strategy as the "ratio-difference method". The renormalization scheme will be specified below. In practice, things are slightly more involved, as we intend to maintain tree-level improvement. Setting c_A = 0 and b_A = b_P = 1 does not change anything in (48, 49), but the quadratic term in (47) does affect the difference. In the next subsection we will show that even with improvement, the renormalized quark masses are given by (50) with d → d_imp, r → r_imp, where the latter quantities are defined in (60, 61).
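The algebra behind the ratio-difference idea is a two-by-two system: the renormalized difference is d/(a Z_S) and the renormalized ratio is r. A minimal sketch (unimproved version, i.e. without the d_imp, r_imp corrections; function name is ours):

```python
def quark_masses_from_ratio_difference(d: float, r: float, z_s: float):
    """Sketch of the unimproved ratio-difference method: given
      d   = a*(m_s^W - m_ud^W)     (bare Wilson-mass difference),
      r   = m_s^PCAC / m_ud^PCAC   (PCAC mass ratio; Z-factors cancel),
      z_s = Z_S(mu)                (scalar renormalization factor),
    the renormalized difference is d/(a*Z_S) and the ratio is r, so
      a*m_ud = d / (Z_S*(r - 1)),   a*m_s = r * a*m_ud.
    Returns (a*m_ud, a*m_s) in lattice units, in the chosen scheme."""
    am_ud = d / (z_s * (r - 1.0))
    return am_ud, r * am_ud
```

The improved analysis replaces d and r by d_imp and r_imp but leaves this solve unchanged.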

Ratio-difference method in full QCD with improvement
In the dynamical theory, the renormalization pattern of quark masses is slightly more involved than the familiar equations (47,49) suggest [42], but it turns out that our "ratio difference method" gets rid of these complications and the final relation is unchanged.
We now discuss how the findings of [42] apply to our method, using their notation, except that we do not use a "hat" to denote renormalized quantities, since they come with a superscript "VWI" or "AWI", just as in the previous subsection. Equations (26, 48) of [42] extend (47, 49) by terms proportional to the trace of the bare quark mass matrix, multiplied by new improvement coefficients b̄_S, b̄_A, b̄_P. These terms make the renormalized quark mass of flavor j depend on all other quark masses, too. Evidently, these terms come from quark loops in the functional determinant, and the perturbative expansion of the new improvement coefficients b̄_S, b̄_A, b̄_P shows that they start out at order g_0^4, which means that they come through two-loop effects (one quark loop and a gluon loop which attaches it to the quark line whose renormalization is studied).
Upon considering the difference of two VWI masses and the ratio of two AWI masses, one obtains expressions in which d and r appear in the leading terms, accompanied by improvement corrections; the point is that the latter agree up to terms of order O(a^2). Accordingly, the difference of the VWI masses and the ratio of the AWI masses can again be expressed through improved variables d_imp and r_imp, defined in (60, 61). In total this means that one finds a m_ud^scheme and a m_s^scheme via (50) with d → d_imp, r → r_imp. In our analysis, the tree-level improvement strategy makes all subleading terms in the square brackets of (60, 61) disappear, except for the one proportional to b_S (with b_S = 1).

Determination of the scalar RI/MOM renormalization factor
Having laid out our overall strategy for obtaining the renormalized quark masses m_ud^phys(µ), m_s^phys(µ) at the physical mass point in a standard scheme at a given scale µ, we now give details of how we compute the single renormalization factor needed, Z_S(µ). We implement the nonperturbative Rome-Southampton method which defines the regularization-independent (RI/MOM) scheme [43], with several practical refinements (see below). In the terminology of [40][41][42] the result is the non-singlet renormalization factor Z_S^NS(µ).

Figure 9: Z_V as extracted from (66). The gauge coupling in this N_f = 3 run is β = 3.61, the quark mass is am = -0.0045. This procedure yields a stable plateau for Z_V.

In the RI/MOM scheme the running of Z_S(µ) is known perturbatively to 4-loop order [44]. However, this is only relevant for the conversion to other schemes, e.g. MS at µ = 2 GeV. Our main result, m_ud and m_s in the RI/MOM scheme at µ = 4 GeV, is derived without reference to perturbation theory.
In the RI/MOM scheme, renormalization conditions are defined in Landau gauge and require the forward, truncated quark Green's function of an operator to be equal to its tree-level value at a Euclidean four-momentum p, whose magnitude is chosen to be the renormalization scale. Given a quark bilinear operator O_Γ^12 = ψ̄_2 Γ ψ_1, where ψ_1 and ψ_2 are mass-degenerate quark fields and Γ is a Dirac matrix, the relevant (amputated) Green's function is

    Λ_Γ(p) = S(p)^{-1} G_Γ(p) S(p)^{-1},   (62)

where G_Γ(p) denotes the quark two-point function with one insertion of O_Γ^12 and S(p) is the propagator of quark flavors 1 and 2. Now, defining a projector P_Γ such that tr{P_Γ Γ} = 1 (the trace is over spin×color), the renormalization condition reads

    ( Z_Γ(µ)/Z_ψ(µ) ) tr{P_Γ Λ_Γ(p)} |_{p^2=µ^2} = 1,   (63)

where Z_ψ is the wavefunction renormalization factor.

Figure 10: Renormalization factors Z_{S,β}^RI(µ^2) as a function of the bosonic momentum squared, for β = 3.31, 3.5, 3.61, 3.7, 3.8. For each β momenta µ ≤ π/(2a) are included. The data from the coarser lattices have been multiplied by a µ-independent factor to match those at β = 3.8. The solid line represents a Pade ansatz in which the 1-loop anomalous dimension is built in as a constraint.
To determine Z_S from the RI/MOM condition (63) with Γ = I, one needs to know Z_ψ. In the original publication [43] the procedure was supplemented with a recipe to obtain Z_ψ from the momentum dependence of the trace of the inverse propagator. Here we avoid the determination of this wave-function renormalization constant altogether, by calculating the ratio Z_S(µ)/Z_V via the RI/MOM procedure and by combining it with Z_V from the 3-point function with a vector-current insertion. In other words, on each ensemble we compute Z_S(µ)/Z_V using

    (Z_S/Z_V)_{β,m}(µ) = tr{P_V Λ_V(p)} / tr{P_S Λ_S(p)} |_{p^2=µ^2},   (65)

where the dependence on the coupling and the N_f = 3 quark mass is indicated by subscripts. The bosonic momentum definition p̂_ν = (2/a) sin(a p_ν/2) is used, and a standard cylinder cut around hyperdiagonal momenta is applied [45]. In addition, we determine Z_V from the ratio of the pseudoscalar two-point function to the corresponding three-point function with a vector-current insertion, (66), where b_V = 1, b̄_V = 0 have been used. Fig. 9 shows the plateau from which we extract Z_{V,β,m}. Combining this factor with the result of (65) yields Z_{S,β,m}(µ), much in the spirit of [47, 48].

Controlling the systematics in the RI/MOM procedure
RI/MOM is a mass independent scheme. Applied to the numerical data for Z RI S (µ) this means that we have to extrapolate all three flavors to vanishing sea and valence quark mass. For this reason, we have generated a series of dedicated N f = 3 lattices (i.e. with three degenerate quarks), where the action S = S Sym g + S SW f and the couplings β = 6/g 2 are the same as in the phenomenological ensembles. The bare parameters and statistics of these runs are summarized in Tab. 2. The specifics of the extrapolation will be discussed below.
In order to obtain tree-level O(a)-improved results with Wilson fermions, one has to improve not only the action, but also the interpolating fields. For standard correlators this has been discussed in the previous two subsections. In addition, in the RI/MOM procedure, one has to remove an O(a) contact term in the quark propagator [43]. We apply here the trace subtraction described in [49][50][51][52], which has the added benefit of greatly improving the signal-to-noise ratio. This subtraction is implemented by replacing the condition (63) by one in which the modified propagator S̃(p) = S(p) - Tr(S(p))/4 is used to define the amputated Green's function, where the trace is in spinor space.
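The spin-trace subtraction can be illustrated on a single momentum-space propagator. In the sketch below the (spin, color, spin, color) index layout and the function name are our own conventions, chosen only to make the spinor-space trace explicit:

```python
import numpy as np

def subtract_spin_trace(S: np.ndarray) -> np.ndarray:
    """Contact-term subtraction S~(p) = S(p) - 1*Tr_spin[S(p)]/4, with the
    trace over spinor indices only (a color matrix remains). S is assumed
    to have shape (4, 3, 4, 3) = (spin, color, spin, color); the result is
    traceless in spinor space by construction."""
    tr_spin = np.einsum('icid->cd', S)   # spin trace: 3x3 color matrix
    delta = np.eye(4)                    # identity in spinor space
    return S - np.einsum('ij,cd->icjd', delta, tr_spin) / 4.0
```

Applying this to every momentum before amputation removes the O(a) contact term from the RI/MOM condition.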
In order to reliably extract the renormalization constants and to convert the resulting quark masses m^RI(µ) to other schemes without losing precision, several conditions should be met: (a) the scale µ at which we take the continuum limit of the RI/MOM renormalized masses needs to be substantially below the momentum cutoff of the coarsest lattice, µ ≪ 2π/a; (b) the conversion to a perturbative scheme has to be done at a scale µ' which is sufficiently large for perturbation theory to be reliable, i.e. µ' ≫ Λ_QCD; (c) the effect of the chiral extrapolation m → 0 needs to be fully controlled. The difficulty of fulfilling the first two conditions in a single simulation is sometimes referred to as the "window problem" of the RI/MOM procedure. In the following we show how we satisfy all three requirements simultaneously.
Ad (a): To renormalize our quark masses and to extrapolate them to the continuum we choose a convenient renormalization scale µ = 2.1 GeV. This scale satisfies µ < π/(2a) for all our lattices (on the coarsest one this figure is about 2.7 GeV). When plotting the running of Z S,β (µ) at different β on top of each other (see Fig. 10), one finds that discretization effects are below our statistical accuracy in this region, and that the form of the running is almost identical for our five β values.
Ad (b): With the procedure described above and by taking the continuum limit we obtain a fully nonperturbative determination of the quark masses in the RI/MOM scheme at µ = 2.1 GeV. In principle, we could stop here, quoting this as our main result. However, if one wants to convert this result to another scale or another scheme, it is evident from Fig. 12 that doing so perturbatively would introduce an uncertainty in the 1-2% range. Therefore we use our renormalization data to run our quark mass results, nonperturbatively, to the scale µ' = 4 GeV, where this perturbative uncertainty is in the 0.5% range and hence subdominant. At µ' = 4 GeV we still have 3 different β values which satisfy the condition µ' < π/(2a). More specifically, we use our data to extrapolate the ratio Z_{S,β}^RI(µ)/Z_{S,β}^RI(µ') to the continuum, with an extremely mild effect (as one can see from Fig. 10, in this interval the three curves lie essentially on top of each other). Accordingly,

    R_S(µ, µ') = lim_{a→0} Z_{S,β}^RI(µ)/Z_{S,β}^RI(µ')   (69)

is the continuum extrapolated ratio which encodes the nonperturbative running of the scalar renormalization constant, in the continuum, between µ = 2.1 GeV and µ' = 4 GeV, and which allows us to evolve data from all our lattices, including the coarser ones, to µ' = 4 GeV, where we perform the final continuum extrapolation. Through this procedure we obtain fully nonperturbatively renormalized quark masses in the RI/MOM scheme at µ' = 4 GeV, which represent our main result. For the reader's convenience we also convert them to other schemes. To this end we use 4-loop perturbative running to convert to the RGI framework (where we use the conventions of [53] with b_0, d_0 adjusted to N_f = 3), and subsequently to the MS scheme (which is perturbatively defined).

Figure 12 (caption fragment): The analytic 4-loop running [44] differs from the numerically integrated one by 5-loop effects; in the labels this is called "4-loop/ana".

Ad (c): As mentioned above, the RI/MOM scheme is a massless renormalization scheme.
Since the dedicated N_f = 3 simulations listed in Tab. 2 use finite quark masses (roughly in the range m_s^phys/3 < m < m_s^phys), we have to perform a chiral extrapolation at some point. In the procedure described in the previous paragraph, the numerical data for Z_{S,β,m}^RI(µ) were first extrapolated to the chiral limit to give Z_{S,β}^RI(µ). Based on this, the renormalized quark masses m_β^RI(µ) and the ratios Z_{S,β}(µ')/Z_{S,β}(µ) were extrapolated to the continuum, as detailed in (a) and (b), respectively. To give a reliable estimate of the systematic uncertainties involved, we supplement this procedure with a second one in which we interchange the order of limits. Technically, this means that one defines an intermediate MOM scheme which is not a massless one, but is instead based on a fixed reference quark mass. We use m_ref^RGI = 70 MeV, since, for all β, this value can be reached by interpolation. In this scheme the renormalized light and strange quark masses are determined at the scales µ ∈ {1.3, 2.1} GeV, and extrapolated to the continuum. This yields m_{ud,m_ref}^MOM(µ), and ditto for m_s. Staying in this massive scheme, these quark masses are evolved to the scale µ' = 4 GeV. In this step a fully controlled continuum extrapolation can be performed, since we have three lattice spacings satisfying µ' < π/(2a). At this point we have the renormalized quark mass in the form

    m_{ud,m_ref}^MOM(µ') = [ Z_{S,m_ref}(µ)/Z_{S,m_ref}(µ') ] m_{ud,m_ref}^MOM(µ),   (70)

where either factor has been extrapolated to the continuum. In the last step, we switch from the intermediate massive MOM scheme to the massless RI/MOM scheme by multiplying (70) by the continuum extrapolated ratio Z_{S,m_ref}(µ')/Z_S(µ'). This yields the same m_ud^RI(µ'), m_s^RI(µ') as before, except that the order of limits has been interchanged. Note that all continuum extrapolations are entirely flat and the effect of the mass extrapolation is about 1%, implying that all limiting procedures are fully controlled.
Having obtained our main result, the RI/MOM masses at µ ′ = 4 GeV, we can transform them to other schemes as described under (b).
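The nonperturbative evolution step described under (b) can be sketched numerically. The following minimal illustration (hypothetical arrays and function names, not the collaboration's analysis code) extrapolates the per-lattice ratio Z_S(µ)/Z_S(µ') linearly in a^2 (one of the two cut-off ansaetze considered) and then uses the result to run a mass from µ to µ':

```python
import numpy as np

def continuum_ratio(a2: np.ndarray, z_mu: np.ndarray,
                    z_mup: np.ndarray) -> float:
    """Continuum limit of R_S(mu,mu') = Z_S(mu)/Z_S(mu'): a linear fit of
    the per-lattice ratio in a^2, read off at a^2 = 0. One entry per
    lattice spacing satisfying mu' < pi/(2a); inputs are illustrative."""
    ratio = z_mu / z_mup
    slope, intercept = np.polyfit(a2, ratio, 1)  # highest power first
    return float(intercept)

def evolve_mass(m_at_mu: float, r_s: float) -> float:
    """Run a RI/MOM mass from mu to mu' with the nonperturbative ratio:
    since m^RI(mu) scales like 1/Z_S(mu), m^RI(mu') = R_S(mu,mu') * m^RI(mu)."""
    return r_s * m_at_mu
```

In the actual analysis the O(αa) ansatz is used as an alternative, and the spread between the two extrapolations feeds into the systematic error.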

Summary of RI/MOM renormalization
Let us summarize this section. We compute the quark masses m phys ud and m phys s through the "ratio-difference method" in the RI/MOM scheme at the scale µ ′ = 4 GeV, nonperturbatively and with extrapolation to the continuum.
The mild quark mass dependence of the renormalization factors is eliminated through a chiral extrapolation. Cut-off effects are likewise removed through a continuum extrapolation. In this step we are extremely conservative: we consider not only the formally leading cut-off effects O(αa), but also subleading effects proportional to O(a^2), adding the spread to the final systematic error (see Sec. 14). We think this is necessary, since even with a set of 5 lattice spacings, we cannot exclude the possibility that the subleading O(a^2) cut-off effects largely affect the continuum extrapolation. If we were to consider only the leading O(αa) cut-off effects, our systematic error would be significantly smaller.
The quark masses in the RI/MOM scheme at the scale µ ′ = 4 GeV are our main result, obtained in a way which guarantees that they are truly nonperturbative. Using perturbation theory in a regime where it is well behaved, we convert them to the universal RGI prescription and subsequently to the perturbatively defined MS scheme at the scale µ = 2 GeV.

Quenched overall check
To demonstrate that our 2-step HEX smeared clover action (1) and the nonperturbative renormalization of the quark mass yield reliable results in the continuum, we repeat the quenched benchmark calculation [53] of the quantity m s +m ud , using our setup.
We use pure Wilson glue at five couplings between β = 5.7366 and β = 6.3, each time saving about 600 well-decorrelated configurations for the analysis (i.e. 200 for the Z-factors and 400 for the masses). The couplings and geometries have been chosen to realize a fixed physical box size of about L = 1.84 fm, see Tab. 3 for details. On each set at least 4 quark masses are used to safely interpolate to the point M_P r_0 = 1.229, where M_P is the pseudoscalar meson mass and where the numerical value has been chosen to match M_K^phys r_0 with r_0 = 0.49 fm [54]. The computation closely follows the dynamical case. We renormalize the VWI quark mass sum with the methods described in the previous section, and we use the same procedure to convert to the MS scheme. In more detail, we begin by measuring m^PCAC as a function of the bare mass m^bare, which shows a linear relationship. The intercept with the x-axis yields m^crit and thus m^W as defined in (47). Next, we determine Z_S(µ) via RI/MOM [43] to obtain m^VWI according to (47). It is easy to see that this is the flavor non-singlet Z_S(µ), since all quark-disconnected contributions vanish in the quenched theory. In close analogy with our phenomenological analysis, we choose the matching scales µ = 2.1 GeV and µ' = 3.5 GeV.
Combining the continuum extrapolated ratio Z_S(µ')/Z_S(µ) with Z_{S,β}(µ), we obtain Z_{S,β}(µ') and the renormalized mass m^VWI(µ') in the RI/MOM scheme at the scale µ' = 3.5 GeV. Finally, we use perturbation theory to convert to the MS scheme at the scale of 2 GeV. The result is identified with m_s + m_ud in this scheme, at the given lattice spacing, and multiplied by r_0 to obtain a dimensionless quantity (cf. Tab. 3). We find that we can extrapolate these values linearly in αa or a^2, with the data showing a slight preference for the latter option, as can be seen from the two panels of Fig. 13. Using the machinery for propagating both statistical and systematic errors that will be described in Sec. 14, the combined result in the continuum reads (m_s + m_ud) r_0 = 0.2609(39)(28) in the MS scheme at µ = 2 GeV. Our result is in perfect agreement with the continuum value (m_s + m_ud) r_0 = 0.261(9) quoted by the ALPHA collaboration [53]. It is consistent, within less than 1σ, with the result 0.274(18) given by JLQCD [55] and, within less than 2σ, with the value 0.312(28) obtained in a computation with quenched overlap fermions that includes a continuum extrapolation [56]. There is some tension with the result 0.293(6) by CP-PACS [57], but one should keep in mind that the systematic uncertainty due to the perturbative renormalization is not included in their error. In short, we find good agreement with the most precise results in the literature. We take this as evidence that the renormalization procedure described in Sec. 11 yields reliable results.

Using dispersive input to obtain m u and m d
For decades the most reliable source of information on light quark masses has been current algebra, in particular in its modern form, known as Chiral Perturbation Theory (ChPT). A major drawback of this framework is that it gives access only to quark mass ratios, not to absolute values. This is a consequence of the fact that in the chiral Lagrangian all quark masses appear in the combination B_0 m_q, and the condensate parameter B_0 does not occur anywhere else. By contrast, we have determined m_ud ≡ (m_u + m_d)/2 and m_s in MeV units. Accordingly, by comparing our value of the ratio m_s/m_ud to the ChPT one, we can learn something about the convergence pattern of SU(3) ChPT. Furthermore, one may combine our values for m_ud and m_s with the best available information on another ratio (Q, see below) to obtain a result for the individual quark masses m_u, m_d.

Comparing our value of m s /m ud to the one in ChPT
Using dispersive information on Q to split m_ud into m_u and m_d

As mentioned in the previous subsection, ChPT is well suited to address the ratios m_s/m_d and m_u/m_d. A particularly robust way (from the ChPT viewpoint) to encode such information on quark mass ratios is to introduce the double ratio

    Q^2 = (m_s^2 - m_ud^2)/(m_d^2 - m_u^2),   (82)

since this quantity is unaffected by next-to-leading order (NLO) effects in the chiral expansion. Modulo a tiny correction, (82) can be put into a form known as "Leutwyler's ellipse" [59]

    (m_u/m_d)^2 + (1/Q^2)(m_s/m_d)^2 = 1,   (83)

and relying on Dashen's theorem [58] or refinements thereof (see e.g. [21]), one might attempt to determine the value of Q from the masses of the charged and neutral kaons and pions.
Since we intend to use (82) to predict the isospin splittings in QCD (i.e. without electromagnetism), it seems more advisable to build on the long tradition in the phenomenological literature of determining Q from the rate of η → 3π decays or from the branching ratio of ψ' → ψπ^0 versus ψ' → ψη decays. The former amplitude is particularly interesting, as it violates isospin while being barely affected by electromagnetic corrections [60]. Evidently, this renders it sensitive to the effect of m_d - m_u ≠ 0, which is exactly what we are interested in. In the following, we restrict ourselves to the dispersive treatment of the η → 3π amplitude, as given by Kambor-Wiesendanger-Wyler [61], Anisovich-Leutwyler [62], and Colangelo-Lanz-Passemar [63]. In the first place we note that the central value found in these works has been remarkably consistent over one and a half decades. Let us also emphasize that a dispersive treatment is, conceptually, as much a first-principles approach as a lattice computation: dispersion theory rests exclusively on the axioms of quantum field theory. In a world with perfect experimental data, this would be the complete story. However, with presently available data, additional input is required (see e.g. [63]). To account for these provisional elements, Leutwyler has assigned his estimate Q = 22.3(8) [64] a much larger error bar than claimed in some of the publications it is based on. In our view this is the most accurate value available, if one is not willing to resort to model calculations, and we shall thus stay content with its rather conservative error bar.
We now extend our lattice determination of m_ud and m_s/m_ud to all three quark masses, using this dispersive information. Upon rewriting (82, 83) in the form

    (m_d - m_u)/(m_d + m_u) = [ (m_s/m_ud)^2 - 1 ] / (4Q^2),   (84)

it follows that the above-mentioned value of Q and our lattice result

    m_s/m_ud = 27.53(20)(08)   (85)

yield the light quark mass asymmetry parameter

    (m_d - m_u)/(m_d + m_u) = 0.381(06)(28),   (86)

where the error induced by Q is counted as a systematic error. As an aside we mention that this asymmetry parameter is equivalent to m_u/m_d = 0.448(06)(29). Combining (86) with our results for m_ud and m_s then yields the individual masses m_u and m_d.
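The algebra can be carried out numerically. The sketch below (function name is ours; central values only, no error propagation) uses the double ratio Q^2 = (m_s^2 - m_ud^2)/(m_d^2 - m_u^2) together with m_u + m_d = 2 m_ud, which gives the asymmetry eps = (m_d - m_u)/(2 m_ud) = (R^2 - 1)/(4Q^2) for R = m_s/m_ud:

```python
def split_light_masses(m_ud: float, R: float, Q: float):
    """Split m_ud into (m_u, m_d) from the double ratio
    Q^2 = (m_s^2 - m_ud^2)/(m_d^2 - m_u^2), using m_u + m_d = 2*m_ud.
    With R = m_s/m_ud the asymmetry is eps = (R^2 - 1)/(4*Q^2), hence
    m_u = m_ud*(1 - eps), m_d = m_ud*(1 + eps). Central values only."""
    eps = (R * R - 1.0) / (4.0 * Q * Q)
    return m_ud * (1.0 - eps), m_ud * (1.0 + eps)

# with the central values quoted in the text, R = 27.53 and Q = 22.3:
m_u, m_d = split_light_masses(1.0, 27.53, 22.3)  # m_ud normalized to 1
# m_u/m_d comes out near 0.449, consistent with the quoted 0.448(06)(29)
```

Note that setting eps = 1 (i.e. m_u = 0) in the same relation reproduces the value Q ≈ 13.8 mentioned in the next subsection.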

Physics implication, robustness issues and precision outlook
Physicswise, an important conclusion is that our result (86) for the light quark mass asymmetry parameter excludes a vanishing up-quark mass by 22.1 standard deviations. This is a consequence of the dispersive determination of Q being entirely inconsistent with 13.8, the value of Q which relation (84) and our result for m s /m ud would enforce if m u = 0. As can be seen from (84), the asymmetry parameter depends strongly on the ratio m s /m ud , which is the quantity that we have determined to sub-percent precision. The bottom line is that our precise lattice results and the dispersive processing of phenomenological information which excludes very large corrections to Dashen's theorem, when combined, rule out the simplest proposed solution to the so-called "strong CP problem". This corroborates previous findings [59].
Note that the way in which we have used phenomenological information is designed to make sure that the so-called "Kaplan-Manohar ambiguity" is circumvented in our derivation of m_u and m_d. This ambiguity expresses the fact that a redefinition of the quark condensate and of certain low-energy constants allows one to move on Leutwyler's ellipse [65]. It represents an accidental symmetry of those Green's functions in the effective theory which determine pseudoscalar masses, scattering amplitudes and matrix elements of the vector and axial-vector currents [59]. However, the aspect ratio of Leutwyler's ellipse is not affected by this ambiguity, and it is this shape information^5 which is encoded in Q. In consequence, relation (84) ensures that the high precision that we have reached in m_s/m_ud, together with the robust value of Q that we use, leads to a determination of the asymmetry parameter (86), and thus of the individual u and d quark masses, which is unaffected by the Kaplan-Manohar ambiguity.

^5 We remark that Q as defined in (82) picks up, under a Kaplan-Manohar transformation, terms of order NNLO and a change proportional to m_d - m_u. The latter "deficiency" could be cured by defining Q_1^2 = (m_s^2 - m_d^2)/(m_d^2 - m_u^2) [66]. Note, however, that the numerical difference between Q and Q_1 [or Q_2, the quantity that shows up in (84)] is about one permil, i.e. more than an order of magnitude smaller than the uncertainty that we have assigned to Q.
We stress that, in our view, there is not much conceptual difference between using only M_π, M_K as input quantities and including Q, too. To compute m_ud, m_s we needed two (polished) experimental input numbers to adjust the average light and the strange quark masses, apart from M_Ω to set the overall scale (cf. Sec. 4). To compute m_u, m_d, m_s we evidently need a third one, and we are well advised to choose one which is sensitive to the effect we want to quantify. We select Q for its large sensitivity to QCD-induced isospin breaking, which means it requires very little theoretical polishing, and because the little polishing it does require rests on dispersion theory, which is well founded. Still, there is room for improvement, as can be seen from the fact that our value of m_ud has 2% precision, while m_u and m_d have only 5% and 3% accuracy, respectively. The problem is that the current value of Q determines the asymmetry parameter (86) to only about 7% precision. While improvements on the value of Q obtained in this way may be possible [63], reaching accuracies on m_u, m_d below the few percent level will most probably require a different approach, even more heavily based on lattice field theory. Indeed, once simulations become available with N_f = 1+1+1 physical quark flavors (i.e. with nondegenerate up, down, and strange quark masses, each taken at its physical value) and with an additional abelian gauge field to account for electromagnetic interactions, it will become possible to take full advantage of the very accurately known K^+ and K^0 masses to determine m_u and m_d with even higher precision.

Assessment of systematic errors
Our approach is to establish one global fit to interpolate our 11+12+9+9+6 = 47 simulations at 5 different lattice spacings (cf. Tab. 1) to the physical mass point (i.e. physical M π and M K ) and to extrapolate to zero lattice spacing (i.e. a → 0). In order to obtain a reliable estimate of the systematic error involved, we repeat the entire analysis with a large selection of interpolation formulae, mass cuts, discretization terms, fit ranges, and renormalization procedures.
In order to extrapolate or interpolate a given quantity to the physical quark mass point, one needs to expand it around some pion and kaon mass point. Often the N_f = 2 or N_f = 3 chiral limit is chosen as an expansion point and hence SU(2) or SU(3) ChPT [37, 38] as the theoretical framework. Expressing the dependence on the light quark mass as a dependence on M_π, this kind of expansion leads, for a quantity which vanishes in the chiral limit, to a quadratic term ∝ M_π^2 and higher order chiral logs, e.g. ∝ M_π^4 log(M_π^2/Λ^2), with known prefactors but unknown scale Λ. In many cases the practical usefulness of knowing the prefactors is limited, since they contain other quantities (e.g. the axial coupling g_A for octet baryons) which may not be available from the same simulation and which one may not want to borrow from phenomenology. Furthermore, it is rather difficult for a fit to distinguish, e.g., a pure M_π^4 contribution from an M_π^4 log(M_π^2/Λ^2) chiral log. Accordingly, choosing an expansion point for an interpolation somewhere in the middle of the region where one has data (or in the middle of the region defined by the data points and the target point in the case of an extrapolation) and using a simple Taylor expansion in M_π^2 leads to rather similar results [2]. To flesh out the meaning of these statements, let us consider the quantities of interest, m_ud and m_s. In our analysis we use the NLO mass formula (45) from SU(2) ChPT [37], albeit in reversed form, so that it expresses m_ud as a function of M_π. To the order we are working at, this inversion can be done in several ways [the difference is an NNLO effect]. In the resulting relations we introduce a hadronic quantity ∆ to parametrize the small deviation of our strange quark mass from its physical value [2].
Alternatively, for the light quark mass we use a Taylor expansion in M_π^2, while the strange quark mass is always parametrized by a Taylor ansatz in ∆. We have tried to augment these formulas with higher order terms, both in M_π^2 and ∆, but we found the additional coefficients to be consistent with zero, at the given precision of our data. This yields 3 options for the mass interpolation or extrapolation of the pseudoscalars. Similarly, for the Ω baryon that serves to set the scale, a Taylor ansatz in M_π^2 and 2M_K^2 - M_π^2 is used (cf. Sec. 4). In total we have 3 functional ansaetze to interpolate our data.
A standard way to test the functional ansatz is to prune the data with mass cuts. We use M π < {380, 480} MeV for the scale setting and M π < {340, 380} MeV for the quark mass determination, thus a total of 4 mass cuts.
A source of error which, in practice, often proves dominant is the contamination of the ground state in the two-point correlator by excited states. To reduce this contamination we use a Gaussian source and sink with a fixed width of about 0.32 fm. We tested 1-state and 2-state fits, and found complete agreement if the 1-state fits start at t_min ≃ 0.7 fm for the PP, PA_4, A_4P, A_4A_4 meson channels and from t_min ∼ 0.8 fm for the Ω. In lattice units this amounts to at_min = {6, 8, 9, 11, 13} for β = {3.31, 3.5, 3.61, 3.7, 3.8} (and ∼ 20% later for baryons). In order to estimate any remaining excited-state effects, we repeated our analysis with an even more conservative meson fit range (starting at at_min = {7, 9, 11, 13, …} and ∼ 20% later for baryons). The end of the fit interval was always chosen to be at_max = 2.7 × at_min, or T/2 − 1 for lattices with a time extent shorter than 5.4 × at_min. In all cases, the fits were performed in a correlated way. In total this gives 2 different fit ranges to make sure that contamination by excited states is under control.

Table 4: Split-up of the total systematic uncertainty of m_ud^phys, m_s^phys and m_s^phys/m_ud^phys (from top to bottom) into the various contributions. Entries in columns 1-3 are in MeV and refer to the RI/MOM scheme at µ = 4 GeV. Columns 4-10 indicate the relative share of the systematic error given in column 3 (the squares of these numbers add up to 1). The headers of these columns refer to the plateau range in the primary observables, the overall scale setting, the interpolation ansatz to tune to the physical mass point, the cut in the pion mass, the details of the renormalization procedure (read-off scale, chiral extrapolation), and the continuum extrapolation.
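The fit-window rule described above can be sketched in a few lines. The at_min values and the 2.7 × at_min (capped at T/2 − 1) rule are taken from the text; the function name and interface are purely illustrative.

```python
# Sketch of the fit-window rule from the text: 1-state meson fits start at
# at_min (per beta), baryon fits ~20% later, and the window ends at
# 2.7 * at_min, capped at T/2 - 1 for lattices with a short time extent.
AT_MIN = {3.31: 6, 3.5: 8, 3.61: 9, 3.7: 11, 3.8: 13}  # meson channels

def fit_window(beta, T, baryon=False):
    t_min = AT_MIN[beta]
    if baryon:                         # baryons start ~20% later
        t_min = round(1.2 * t_min)
    t_max = min(round(2.7 * t_min), T // 2 - 1)
    return t_min, t_max
```

For example, at β = 3.8 on a lattice with T = 64, the cap T/2 − 1 = 31 is active, since 2.7 × 13 ≈ 35 would exceed it.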
As a result of the tree-level value c_SW = 1, our action formally has O(αa) cutoff effects. However, due to the smearing, the coefficient in front of this term is small, and the formally subleading O(a^2) contributions might numerically dominate over the O(αa) part. To account for this, we augment our global fit with cutoff terms which stipulate either an O(αa) or an O(a^2) deviation from the continuum. This ambiguity comes into play both in the evolution function (69) and in the continuum extrapolation of the quark masses in the RI/MOM scheme, which yields 2 × 2 = 4 options.
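A minimal sketch of the two continuum-extrapolation ansätze, with synthetic numbers: the observable is fitted as X(a) = X_0 + c·t(a) with t(a) either α·a or a^2. The fixed placeholder value for α and all data below are assumptions for illustration, not values from this work.

```python
import numpy as np

# Illustrative sketch (synthetic numbers): extrapolate an observable to
# a = 0 with either an O(alpha*a) or an O(a^2) cutoff term. Here alpha(a)
# is frozen to a single placeholder value for simplicity.
ALPHA = 0.2  # hypothetical coupling at the lattice scale

def continuum_limit(a, X, term="a2"):
    t = ALPHA * a if term == "alpha_a" else a**2
    A = np.column_stack([np.ones_like(a), t])   # X(a) = X0 + c * t(a)
    (X0, c), *_ = np.linalg.lstsq(A, X, rcond=None)
    return X0

a = np.array([0.054, 0.065, 0.077, 0.093, 0.116])  # lattice spacings [fm]
X = 95.0 * (1 + 0.5 * a**2)                        # fake O(a^2) data
X0 = continuum_limit(a, X, term="a2")
```

In the full analysis the spread between the two ansätze, in both places where the ambiguity enters, feeds into the systematic error.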
Besides the variations described above, we consider 3 options in the nonperturbative renormalization procedure (scale µ, massless versus massive intermediate scheme), see Sec. 11.
All of this serves the goal of quantifying potential systematic effects on our final results. In addition, there are standard methods to assess the size of the statistical error. Apart from the autocorrelation analysis detailed in Sec. 7, we used different blocking sizes on our ensembles, ranging from 1 to 10 configurations, where two adjacent configurations are separated by ten τ = 1 MD updates (cf. Sec. 5). Last but not least, we found that artificial thermalization cuts (where we ignore the first 20-100 configurations of the thermalized ensembles) induce no noticeable change in our results, and therefore we conclude that possible residual thermalization effects are irrelevant for the error analysis.
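The blocking check mentioned above can be sketched as follows: the naive error of the mean is recomputed after averaging the measurement series in blocks of 1 to 10 configurations; if the error is stable under blocking, residual autocorrelations are under control. The data below are synthetic and uncorrelated, purely for illustration.

```python
import numpy as np

# Sketch of the blocking check: recompute the error of the mean after
# averaging the time series in non-overlapping blocks of `block`
# configurations; stable errors indicate autocorrelations are tamed.
def blocked_error(x, block):
    n = (len(x) // block) * block          # drop the incomplete last block
    means = x[:n].reshape(-1, block).mean(axis=1)
    return means.std(ddof=1) / np.sqrt(len(means))

rng = np.random.default_rng(0)
x = rng.normal(size=1000)                  # synthetic "measurements"
errs = [blocked_error(x, b) for b in range(1, 11)]
```

For correlated data the blocked error grows with the block size until it plateaus at the true error; for the uncorrelated toy data above it stays flat from the start.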
Putting everything together, we have 3 ansätze for the interpolation of the quark masses to the physical point, 4 mass cuts in the scale setting and the quark mass determination, 2 different fit intervals for the primary observables, 4 ansätze for the continuum extrapolation, and 3 ways of performing the RI/MOM renormalization. This gives a total of 3 · 4 · 2 · 4 · 3 = 288 analyses.
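The 288 analyses are simply the Cartesian product of the independent variations listed above; the labels in the sketch below are shorthand, not identifiers from this work.

```python
from itertools import product

# The 288 analyses as the Cartesian product of the analysis variations
# (names are illustrative shorthand for the choices described in the text).
variations = {
    "interpolation": 3,   # ansaetze for the mass interpolation
    "mass_cut":      4,   # pion-mass cuts (scale setting x quark masses)
    "fit_range":     2,   # fit intervals for the primary observables
    "continuum":     4,   # cutoff terms in the continuum extrapolation
    "rimom":         3,   # RI/MOM renormalization options
}
analyses = list(product(*(range(n) for n in variations.values())))
```

Each element of `analyses` labels one complete analysis chain, and the full set is rerun on every bootstrap sample.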
In order to quote a final result, we follow the procedure used in [2]. It is essential to note that we have no a priori reason to favor one of these fits over another. Therefore the analysis method should represent the full spread of all reasonable and theoretically justified treatments of our data. In other words, we use all 288 procedures, and weight the results by the fit quality Q = Γ(n/2, χ^2/2) to form a histogram. Next, we compute the mean and standard deviation of this distribution, and these yield the central value and the systematic error which we quote. Finally, we repeat this extensive procedure on 2000 bootstrap samples. The standard bootstrap error of the mean gives the statistical error. An additional benefit of our method to treat systematic effects is that we can temporarily suppress one of the variations considered (i.e. abandon one of the factors whose product leads to the 288 procedures) to learn about the contribution of this individual factor to the total error. The total "error budget" compiled in this way is shown in Tab. 4. Evidently, it exhibits the cutoff effects as the dominant source of systematic uncertainty in our results.

Table 5: Renormalized quark masses in the RI/MOM scheme at µ = 4 GeV, and after conversion to RGI and the MS-bar scheme at µ = 2 GeV. The RI/MOM values are fully nonperturbative, so the first line is our main result. The first two columns emerge directly from our lattice calculation; the last two build, in addition, on dispersive information on Q. The precision of m_s and m_ud is somewhat below the 2% level; for m_u and m_d it is about 5% and 3%, respectively. The ratio m_s/m_ud = 27.53 (20)(08) is independent of the scheme and accurate to better than 1%.
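The weighted-histogram step can be sketched as follows; all inputs are synthetic stand-ins (fake results and fit qualities), and the bootstrap fluctuation model is a placeholder for the actual resampled ensembles.

```python
import numpy as np

# Minimal sketch (synthetic inputs) of the error procedure: each of the
# 288 analyses contributes a result weighted by its fit quality Q; the
# weighted mean and standard deviation give the central value and the
# systematic error, and a bootstrap gives the statistical error.
def central_and_systematic(results, Q):
    w = np.asarray(Q) / np.sum(Q)
    mean = np.sum(w * results)
    sys_err = np.sqrt(np.sum(w * (results - mean) ** 2))
    return mean, sys_err

rng = np.random.default_rng(1)
results = rng.normal(3.5, 0.02, size=288)   # fake per-analysis results
Q = rng.uniform(0.1, 1.0, size=288)         # fake fit qualities

mean, sys_err = central_and_systematic(results, Q)

# statistical error: spread of the weighted mean over bootstrap samples
# (here modeled by adding placeholder Gaussian noise to the results)
boot = [central_and_systematic(results + rng.normal(0, 0.01, 288), Q)[0]
        for _ in range(2000)]
stat_err = np.std(boot, ddof=1)
```

Suppressing one variation amounts to restricting `results` and `Q` to the corresponding subset of analyses and comparing the resulting systematic error, which is how the error budget in Tab. 4 is compiled.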
Altogether, our procedure to assess both statistical and systematic errors is an extended frequentist method [20], which was already used in [2].

Summary
We have carried out a precise determination of the average light quark mass m_ud = (m_u + m_d)/2 and of the strange quark mass m_s, using nonperturbative N_f = 2 + 1 lattice QCD and nonperturbative renormalization throughout. Our data cover 5 lattice spacings in the range 0.054−0.116 fm, with pion masses down to ∼ 120 MeV and box sizes up to 6 fm. This allows for a safe extrapolation to the continuum (a → 0) and to infinite volume (L → ∞).
We have devised a number of innovative methods, most notably a scheme to exploit the different renormalization patterns of Wilson and PCAC quark masses with tree-level O(a)-improved clover quarks, and a procedure to overcome the RI/MOM window problem by taking a separate continuum limit of the running of the scalar density R_S(µ, µ′).
Our main result, m_s and m_ud in the RI/MOM scheme at renormalization scale µ = 4 GeV (cf. Tab. 5), is from first principles and fully nonperturbative. To ease comparison with the literature, these values are converted to RGI conventions and, subsequently, to the MS-bar scheme. In this step reference to perturbation theory is unavoidable, but we do this in a controlled way, since we show that the 4-loop anomalous dimension of the scalar density is consistent with our nonperturbative running for µ ≳ 4 GeV. The ratio m_s/m_ud is scheme and scale invariant. It turns out that our action entails favorable scaling properties not just for hadron masses, but also for renormalized quark masses, as the plot of a representative continuum extrapolation in Fig. 14 (shown at a = 0.06, 0.08, 0.10, 0.12 fm) demonstrates. The combination of using pion masses down to (and even below) the physical value and having precise and fully nonperturbative renormalization factors allows us to determine m_s and m_ud with a precision of better than 2%, and their ratio to better than 1%.
A determination of the individual light quark masses m u and m d by lattice methods alone is beyond the scope of this paper. Nevertheless, the precision of our values of m ud and m s /m ud allows for a fruitful use of the result of the dispersive analysis of the double ratio Q (cf. discussion in Sec. 13). By combining these pieces of information, we obtain values of the individual quark masses m u and m d with a precision of 5% and 3%, respectively (cf. Tab. 5).