1 Introduction

Flavour physics provides an important opportunity for exploring the limits of the Standard Model of particle physics and for constraining possible extensions of theories that go beyond it. As the LHC explores a new energy frontier and as experiments continue to extend the precision frontier, the importance of flavour physics will grow, both in terms of searches for signatures of new physics through precision measurements and in terms of attempts to unravel the theoretical framework behind direct discoveries of new particles. A major theoretical limitation consists in the precision with which strong interaction effects can be quantified. Large-scale numerical simulations of lattice QCD allow for the computation of these effects from first principles. The scope of the Flavour Lattice Averaging Group (FLAG) is to review the current status of lattice results for a variety of physical quantities in low-energy physics. Set up in November 2007 (Footnote 1), it comprises experts in Lattice Field Theory and Chiral Perturbation Theory. Our aim is to provide an answer to the frequently posed question “What is currently the best lattice value for a particular quantity?”, in a way which is readily accessible to non-lattice-experts. This is generally not an easy question to answer; different collaborations use different lattice actions (discretisations of QCD) with a variety of lattice spacings and volumes, and with a range of masses for the \(u\)- and \(d\)-quarks. Not only are the systematic errors different, but also the methodology used to estimate these uncertainties varies between collaborations. In the present work we summarise the main features of each of the calculations and provide a framework for judging and combining the different results. Sometimes it is a single result which provides the “best” value; more often it is a combination of results from different collaborations. Indeed, the consistency of values obtained using different formulations adds significantly to our confidence in the results.

The first edition of the FLAG review was published in 2011 [1]. It was limited to lattice results related to pion and kaon physics: light-quark masses (\(u\)-, \(d\)- and \(s\)-flavours), the form factor \(f_+(0)\) arising in semileptonic \(K \rightarrow \pi \) transitions at zero momentum transfer and the decay constant ratio \(f_K/f_\pi \), as well as their implications for the CKM matrix elements \(V_{us}\) and \(V_{ud}\). Furthermore, results were reported for some of the low-energy constants of \(\hbox {SU}(2)_L \otimes \hbox {SU}(2)_R\) and \(\hbox {SU}(3)_L \otimes \hbox {SU}(3)_R\) Chiral Perturbation Theory and the \(B_K\) parameter of neutral kaon mixing. Results for all of these quantities have been updated in the present paper. Moreover, the scope of the present review has been extended by including lattice results related to \(D\)- and \(B\)-meson physics. We focus on \(B\)- and \(D\)-meson decay constants, form factors, and mixing parameters, which are most relevant for the determination of CKM matrix elements and the global CKM unitarity-triangle fit. Last but not least, the current status of lattice results on the QCD coupling \(\alpha _\mathrm{s}\) is also reviewed. Bottom- and charm-quark masses, though important parametric inputs to Standard Model calculations, have not been covered in the present edition. They will be included in a future FLAG report.

Our plan is to continue providing FLAG updates, in the form of a peer-reviewed paper, roughly every two years. This effort is supplemented by our more frequently updated website http://itpwiki.unibe.ch/flag, where figures as well as pdf-files for the individual sections can be downloaded. The papers reviewed in the present edition appeared before the closing date of 30 November 2013.

Finally, we draw attention to a particularly important point. As stated above, our aim is to make lattice QCD results easily accessible to non-lattice-experts and we are well aware that it is likely that some readers will only consult the present paper and not the original lattice literature. We consider it very important that this paper is not the only one which gets cited when the lattice results which are discussed and analysed here are quoted. Readers who find the review and compilations offered in this paper useful are therefore kindly requested to also cite the original sources. The bibliography at the end of this paper should make this task easier. Indeed we hope that the bibliography will be one of the most widely used elements of the whole paper.

This review is organised as follows. In the remainder of Sect. 1 we summarise the composition and rules of FLAG, describe the goals of the FLAG effort and general issues that arise in modern lattice calculations. For the reader’s convenience, Table 1 summarises the main results (averages and estimates) of the present review. In Sect. 2 we explain our general methodology for evaluating the robustness of lattice results which have appeared in the literature. We also describe the procedures followed for combining results from different collaborations in a single average or estimate (see Sect. 2.2 for our use of these terms). The rest of the paper consists of sections, each of which is dedicated to a single physical quantity or to a group of closely connected quantities. Each of these sections is accompanied by an Appendix with explanatory notes.

Table 1 Summary of the main results of this review, grouped in terms of \(N_\mathrm{f}\), the number of dynamical quark flavours in lattice simulations. Quark masses and the quark condensate are given in the \({\overline{\mathrm{MS}}}\) scheme at running scale \(\mu =2\,\mathrm{GeV}\); the other quantities listed are specified in the quoted sections. The marked columns indicate the number of results that enter our averages for each quantity. We emphasise that these numbers only give a very rough indication of how thoroughly the quantity in question has been explored on the lattice and recommend consulting the detailed tables and figures in the relevant section for more significant information. For explanations on the source of the quoted errors for each quantity, the reader is advised to consult the corresponding section, as indicated in the second column.

1.1 FLAG enlargement

Upon completion of the first review, it was decided to extend the project by adding new physical quantities and co-authors. FLAG became more representative of the lattice community, both in terms of the geographical location of its members and the lattice collaborations to which they belong. At the time a parallel effort had been made [2, 3]; the two efforts have now merged in order to provide a single source of information on lattice results to the particle-physics community.

The experience gained in managing the activities of a medium-sized group of co-authors taught us that it was necessary to have a more formal structure and a set of rules by which all concerned had to abide, in order to make the inner workings of FLAG function smoothly. The collaboration presently consists of an Advisory Board (AB), an Editorial Board (EB), and seven Working Groups (WG). The rôle of the Advisory Board is that of general supervision and consultation. Its members may intervene at any point in the process of drafting the paper, expressing their opinion and offering advice. They also give their approval of the final version of the preprint before it is made public. The Editorial Board coordinates the activities of FLAG, sets priorities and intermediate deadlines, and takes care of the editorial work needed to amalgamate the sections written by the individual working groups into a uniform and coherent review. The working groups concentrate on writing up the review of the physical quantities for which they are responsible, which is subsequently circulated to the whole collaboration for criticisms and suggestions.

The most important internal FLAG rules are the following:

  • members of the AB have a 4-year mandate (to avoid a simultaneous change of all members, some of the current members of the AB will have a shorter mandate);

  • the composition of the AB reflects the main geographical areas in which lattice collaborations are active: one member comes from America, one from Asia/Oceania and one from Europe;

  • the mandate of regular members is not limited in time, but we expect that a certain turnover will occur naturally;

  • whenever a replacement becomes necessary this has to keep, and possibly improve, the balance in FLAG;

  • in all working groups the three members must belong to three different lattice collaborations (Footnote 2);

  • a paper is in general not reviewed (nor colour-coded, as described in the next section) by one of its authors;

  • lattice collaborations not represented in FLAG will be asked to check whether the colour coding of their calculation is correct.

The current list of FLAG members and their Working Group assignments is:

\(\bullet \) Advisory Board (AB):    S. Aoki, C. Bernard, C. Sachrajda

\(\bullet \) Editorial Board (EB):    G. Colangelo, H. Leutwyler, A. Vladikas, U. Wenger

\(\bullet \) Working Groups (WG) (each WG coordinator is listed first):

  • Quark masses:    L. Lellouch, T. Blum, V. Lubicz

  • \(V_{us},V_{ud}\):    A. Jüttner, T. Kaneko, S. Simula

  • LEC:    S. Dürr, H. Fukaya, S. Necco

  • \(B_K\):    H. Wittig, J. Laiho, S. Sharpe

  • \(f_{B_{(s)}}\), \(f_{D_{(s)}}\), \(B_B\):    A. El-Khadra, Y. Aoki, M. Della Morte

  • \(B_{(s)}\), \(D\) semileptonic and radiative decays:    R. Van de Water, E. Lunghi, C. Pena, J. Shigemitsu (Footnote 3)

  • \(\alpha _\mathrm{s}\):    R. Sommer, R. Horsley, T. Onogi

1.2 General issues and summary of the main results

The present review aims at two distinct goals:

  (a) offer a description of the work done on the lattice concerning low-energy particle physics;

  (b) draw conclusions on the basis of that work, which summarise the results obtained for the various quantities of physical interest.

The core of the information about the work done on the lattice is presented in the form of tables, which not only list the various results, but also describe the quality of the data that underlie them. We consider it important that this part of the review represents a generally accepted description of the work done. For this reason, we explicitly specify the quality requirements used and provide sufficient details in the appendices so that the reader can verify the information given in the tables.

The conclusions drawn on the basis of the available lattice results, on the other hand, are the responsibility of FLAG alone. We aim at staying on the conservative side and in several cases reach conclusions which are more cautious than what a plain average of the available lattice results would give, in particular when this is dominated by a single lattice result. An additional issue occurs when only one lattice result is available for a given quantity. In such cases one does not have the same degree of confidence in results and errors as one has when there is agreement among many different calculations using different approaches. Since this degree of confidence cannot be quantified, it is not reflected in the quoted errors, but it should be kept in mind by the reader. At present, the issue of having only a single result occurs much more often in heavy-quark physics than in light-quark physics. We are confident that the heavy-quark calculations will soon reach the state that pertains in light-quark physics.

Several general issues concerning the present review are thoroughly discussed in Sect. 1.1 of our initial paper [1] and we encourage the reader to consult the relevant pages. In the remainder of the present section, we focus on a few important points.

Each discretisation has its merits but also its shortcomings. For the topics covered already in the first edition of the FLAG review, we have by now a remarkably broad data base, and for most quantities lattice calculations based on totally different discretisations are now available. This is illustrated by the dense population of the tables and figures shown in the first part of this review. Those calculations which do satisfy our quality criteria indeed lead to consistent results, confirming universality within the accuracy reached. In our opinion, the consistency between independent lattice results, obtained with different discretisations, methods and simulation parameters, is an important test of lattice QCD, and observing such consistency then also provides further evidence that systematic errors are fully under control.

In the sections dealing with heavy quarks and with \(\alpha _\mathrm{s}\), the situation is not the same. Since the \(b\)-quark mass cannot be resolved with current lattice spacings, all lattice methods for treating \(b\) quarks use effective field theory at some level. This introduces additional complications not present in the light-quark sector. An overview of the issues specific to heavy-quark quantities is given in the introduction of Sect. 8. For \(B\)- and \(D\)-meson leptonic decay constants, there already exist a good number of different independent calculations that use different heavy-quark methods, but there are only one or two independent calculations of semileptonic \(B\)- and \(D\)-meson form factors and \(B\) meson mixing parameters. For \(\alpha _\mathrm{s}\), most lattice methods involve a range of scales that need to be resolved and controlling the systematic error over a large range of scales is more demanding. The issues specific to determinations of the strong coupling are summarised in Sect. 9.

The lattice spacings reached in recent simulations go down to 0.05 fm or even smaller. In that region, growing autocorrelation times slow down the sampling of the configurations [48]. Many groups check for autocorrelations in a number of observables, including the topological charge, for which a rapid growth of the autocorrelation time is observed if the lattice spacing becomes small. In the following, we assume that the continuum limit can be reached by extrapolating the existing simulations.

Lattice simulations of QCD currently involve at most four dynamical quark flavours. Moreover, most of the data concern simulations for which the masses of the two lightest quarks are set equal. This is indicated by the notation \(N_\mathrm{f}=2+1+1\) which, in this case, denotes a lattice calculation with four dynamical quark flavours and \(m_{u} = m_{d} \ne m_{s} \ne m_{c}\). Note that calculations with \(N_\mathrm{f}=2\) dynamical flavours often include strange valence quarks interacting with gluons, so that bound states with the quantum numbers of the kaons can be studied, albeit neglecting strange sea quark fluctuations. The quenched approximation (\(N_\mathrm{f}=0\)), in which the sea quarks are treated as a mean field, is no longer used in modern lattice simulations. Accordingly, we will review results obtained with \(N_\mathrm{f}=2\), \(N_\mathrm{f}=2+1\) and \(N_\mathrm{f} = 2+1+1\), but we omit earlier results with \(N_\mathrm{f}=0\). On the other hand, the dependence of the QCD coupling constant \(\alpha _\mathrm{s}\) on the number of flavours is a theoretical issue of considerable interest, and we therefore include results obtained for gluodynamics in the \(\alpha _\mathrm{s}\) section. We stress, however, that only results with \(N_\mathrm{f} \ge 3\) are used to determine the physical value of \(\alpha _\mathrm{s}\) at a high scale.

The remarkable recent progress in the precision of lattice calculations is due to improved algorithms, better computing resources and, last but not least, conceptual developments, such as improved actions which reduce lattice artefacts, actions which preserve (remnants of) chiral symmetry, understanding finite-size effects, non-perturbative renormalisation, etc. A concise characterisation of the various discretisations that underlie the results reported in the present review is given in Appendix A.1.

Lattice simulations are performed at fixed values of the bare QCD parameters (gauge coupling and quark masses), and physical quantities with mass dimensions (e.g. quark masses and decay constants) are computed in units of the lattice spacing; i.e. they are dimensionless. Their conversion to physical units requires knowledge of the lattice spacing at the fixed values of the bare QCD parameters of the simulations. This is achieved by requiring agreement between the lattice calculation and experimental measurement of a known quantity, which “sets the scale” of a given simulation. A few details on this procedure are provided in Appendix A.2.
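As a minimal illustration of the scale-setting step described above, the following Python sketch converts a dimensionless lattice result into physical units. All numerical inputs are invented for illustration, and the use of a generic reference quantity `Q` with mass dimension one is an assumption of this sketch rather than the procedure of any particular collaboration; only the conversion factor \(\hbar c \approx 197.327\) MeV fm is a physical constant.

```python
HBARC_MEV_FM = 197.3269804  # hbar*c in MeV fm


def lattice_spacing_fm(aQ_lattice, Q_phys_mev):
    """Lattice spacing a (in fm) from the dimensionless lattice value a*Q
    and the experimental value of the reference quantity Q (in MeV)."""
    a_inverse_mev = Q_phys_mev / aQ_lattice  # 1/a in MeV
    return HBARC_MEV_FM / a_inverse_mev


def to_physical_mev(aX_lattice, a_fm):
    """Convert a dimensionless lattice value a*X (mass dimension one) to MeV."""
    return aX_lattice * HBARC_MEV_FM / a_fm


# Hypothetical inputs, chosen only to make the arithmetic visible.
a_Q_ref = 0.0660     # reference quantity in lattice units (invented)
Q_ref_exp = 130.2    # its assumed experimental value in MeV (invented)
a_m = 0.0480         # some other mass in lattice units (invented)

a_fm = lattice_spacing_fm(a_Q_ref, Q_ref_exp)
m_mev = to_physical_mev(a_m, a_fm)
print(f"a = {a_fm:.4f} fm, m = {m_mev:.2f} MeV")
```

Note that the physical value of the second quantity depends only on the ratio of the two lattice numbers and on the experimental input, which is why the choice of the reference quantity matters for the final error budget.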

Several of the results covered by this review, such as quark masses, the gauge coupling, and \(B\)-parameters, are quantities defined in a given renormalisation scheme and scale. The schemes employed are often chosen because of their specific merits when combined with the lattice regularisation. For a brief discussion of their properties, see Appendix A.3. The conversion of the results, obtained in these so-called intermediate schemes, to more familiar renormalisation schemes, such as the \({\overline{\mathrm{MS}}}\)-scheme, is done with the aid of perturbation theory. It must be stressed that the renormalisation scales accessible by the simulations are subject to limitations, naturally arising in Field-Theory computations at finite UV and small non-zero IR cutoff. Typically, such scales are of the order of the UV cutoff, or \(\Lambda _\mathrm{QCD}\), depending on the chosen scheme. To safely match to \({\overline{\mathrm{MS}}}\), a scheme defined in perturbation theory, Renormalisation Group (RG) running to higher scales is performed, either perturbatively, or non-perturbatively (the latter using finite-size scaling techniques).

Because of limited computing resources, lattice simulations are often performed at unphysically heavy pion masses, although results at the physical point have recently become available. Further, numerical simulations must be done at finite lattice spacing. In order to obtain physical results, lattice data are generated at a sequence of pion masses and a sequence of lattice spacings, and then extrapolated to \(M_\pi \approx 135\) MeV and \(a \rightarrow 0\). To control the associated systematic uncertainties, these extrapolations are guided by effective theory. For light-quark actions, the lattice-spacing dependence is described by Symanzik’s effective theory [9, 10]; for heavy quarks, this can be extended and/or supplemented by other effective theories such as Heavy-Quark Effective Theory (HQET). The pion-mass dependence can be parameterised with Chiral Perturbation Theory (\(\chi \)PT), which takes into account the Nambu–Goldstone nature of the lowest excitations that occur in the presence of light quarks; similarly one can use Heavy-Light Meson Chiral Perturbation Theory (HM\(\chi \)PT) to extrapolate quantities involving mesons composed of one heavy (\(b\) or \(c\)) and one light quark. One can combine Symanzik’s effective theory with \(\chi \)PT to simultaneously extrapolate to the physical pion mass and continuum; in this case, the form of the effective theory depends on the discretisation. See Appendix A.4 for a brief description of the different variants in use and some useful references.
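The combined extrapolation described above can be sketched in a few lines of Python. The fit ansatz used here, \(Q(M_\pi ^2,a^2) = Q_\mathrm{phys} + c_M (M_\pi ^2 - M_{\pi ,\mathrm{phys}}^2) + c_a a^2\), is the simplest possible assumption (leading analytic terms only, for an \(O(a)\)-improved action); actual analyses use the \(\chi \)PT and Symanzik forms appropriate to each discretisation. The data points are synthetic and all function names are hypothetical.

```python
M_PI_PHYS = 135.0  # physical pion mass in MeV


def solve_linear(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def chiral_continuum_fit(points):
    """Weighted least-squares fit of Q = Q_phys + c_M*(M^2 - M_phys^2) + c_a*a^2.

    points: list of (M_pi in MeV, a in fm, Q, sigma).
    Returns [Q_phys, c_M in GeV^-2, c_a in fm^-2].
    """
    rows, ys = [], []
    for m_pi, a, q, sigma in points:
        # M^2 difference is converted to GeV^2 to keep the columns comparable.
        basis = [1.0, (m_pi**2 - M_PI_PHYS**2) * 1e-6, a**2]
        rows.append([f / sigma for f in basis])
        ys.append(q / sigma)
    n = 3
    AtA = [[sum(r[i] * r[j] for r in rows) for j in range(n)] for i in range(n)]
    Atb = [sum(r[i] * y for r, y in zip(rows, ys)) for i in range(n)]
    return solve_linear(AtA, Atb)


# Synthetic data generated from the ansatz itself (invented parameters).
Q_TRUE, C_M_TRUE, C_A_TRUE = 1.20, 0.5, -2.0
data = [
    (m, a, Q_TRUE + C_M_TRUE * (m**2 - M_PI_PHYS**2) * 1e-6 + C_A_TRUE * a**2, 0.01)
    for m in (200.0, 300.0, 400.0)
    for a in (0.06, 0.09, 0.12)
]
q_phys, c_m, c_a = chiral_continuum_fit(data)
print(f"Q at the physical point and a = 0: {q_phys:.4f}")
```

Because the synthetic points are noise-free, the fit simply recovers the input parameters; with real data the interesting output is the uncertainty on `q_phys` and its sensitivity to the choice of ansatz, neither of which this sketch attempts to estimate.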

2 Quality criteria

The essential characteristics of our approach to the problem of rating and averaging lattice quantities reported by different collaborations have been outlined in our first publication [1]. Our aim is to help the reader assess the reliability of a particular lattice result without necessarily studying the original article in depth. This is a delicate issue, which may make things appear simpler than they are. However, it safeguards against the common practice of using lattice results and drawing physics conclusions from them, without a critical assessment of the quality of the various calculations. We believe that despite the risks, it is important to provide some compact information about the quality of a calculation. However, the importance of the accompanying detailed discussion of the results presented in the bulk of the present review cannot be overestimated.

2.1 Systematic errors and colour-coding

In Ref. [1], we identified a number of sources of systematic errors, for which a systematic improvement is possible, and assigned one of three coloured symbols to each calculation: green star, amber disc or red square. The appearance of a red tag, even in a single source of systematic error of a given lattice result, disqualified it from the global averaging. Since results with green and amber tags entered the averages, and since this policy has been retained in the present edition, we have decided to substitute the amber disc by a green unfilled circle. Thus the new colour coding is as follows:

  • ★ the systematic error has been estimated in a satisfactory manner and convincingly shown to be under control;

  • ○ a reasonable attempt at estimating the systematic error has been made, although this could be improved;

  • ■ no or a clearly unsatisfactory attempt at estimating the systematic error has been made.

We stress once more that only results without a red tag in the systematic errors are averaged in order to provide a given FLAG estimate.

The precise criteria used in determining the colour coding are unavoidably time-dependent; as lattice calculations become more accurate the standards against which they are measured become tighter. For quantities related to the light-quark sector, which have been dealt with in the first edition of the FLAG review [1], some of the quality criteria have remained the same, while others have been tightened up. We will compare them to those of Ref. [1], case-by-case, below. For the newly introduced physical quantities, related to heavy quark physics, the adoption of new criteria was necessary. This is due to the fact that, in most cases, the discretisation of the heavy quark action follows a very different approach to that of light flavours. Moreover, the two Working Groups dedicated to heavy flavours have opted for a somewhat different rating of the extrapolation of lattice results to the continuum limit. Finally, the strong coupling is in a class of its own as far as methods for its computation are concerned, which led to the introduction of dedicated rating criteria for it.

Of course any colour coding has to be treated with caution; we repeat that the criteria are subjective and evolving. Sometimes a single source of systematic error dominates the systematic uncertainty and it is more important to reduce this uncertainty than to aim for green stars for other sources of error. In spite of these caveats we hope that our attempt to introduce quality measures for lattice results will prove to be a useful guide. In addition we would like to stress that the agreement of lattice results obtained using different actions and procedures evident in many of the tables presented below provides further validation.

For a coherent assessment of the present situation, the quality of the data plays a key role, but the colour coding cannot be carried over to the figures. On the other hand, simply showing all data on equal footing would give the misleading impression that the overall consistency of the information available on the lattice is questionable. As a way out, the figures do indicate the quality in a rudimentary way:

  • results included in the average;

  • results that are not included in the average but pass all quality criteria;

  • all other results.

The reason for not including a given result in the average is not always the same: the paper may fail one of the quality criteria, may not be published, be superseded by other results or not offer a complete error budget. Symbols other than squares are used to distinguish results with specific properties and are always explained in the caption.

There are separate criteria for light-flavour, heavy-flavour, and \(\alpha _\mathrm{s}\) results. In the following, the criteria for the former two are discussed in detail, while the criteria for the \(\alpha _\mathrm{s}\) results will be presented separately in Sect. 9.2.

2.1.1 Light-quark physics

The colour code used in the tables is specified as follows:

\(\bullet \) Chiral extrapolation:

  • ★ \(M_{\pi ,{\mathrm {min}}}< 200\) MeV

  • ○ 200 MeV \(\le M_{\pi ,{\mathrm {min}}} \le \) 400 MeV

  • ■ 400 MeV \( < M_{\pi ,{\mathrm {min}}}\)

    It is assumed that the chiral extrapolation is done with at least a three-point analysis; otherwise this will be explicitly mentioned. Note that, compared to Ref. [1], chiral extrapolations are now treated in a somewhat more stringent manner and the cutoff between green star and green open circle (formerly amber disc), previously set at 250 MeV, is now lowered to 200 MeV.

\(\bullet \) Continuum extrapolation:

  • ★ three or more lattice spacings, at least two points below 0.1 fm

  • ○ two or more lattice spacings, at least one point below 0.1 fm

  • ■ otherwise

    It is assumed that the action is \(O(a)\)-improved (i.e. the discretisation errors vanish quadratically with the lattice spacing); otherwise this will be explicitly mentioned. Moreover, for non-improved actions an additional lattice spacing is required. This criterion is the same as the one adopted in Ref. [1].

\(\bullet \) Finite-volume effects:

  • ★ \(M_{\pi ,{\mathrm {min}}} L > 4\) or at least three volumes

  • ○ \(M_{\pi ,{\mathrm {min}}} L > 3\) and at least two volumes

  • ■ otherwise

    These ratings apply to calculations in the \(p\)-regime and it is assumed that \(L_\mathrm{min}\ge 2\) fm; otherwise this will be explicitly mentioned and a red square will be assigned.

\(\bullet \) Renormalisation (where applicable):

  • ★ non-perturbative

  • ○ one-loop perturbation theory or higher with a reasonable estimate of truncation errors

  • ■ otherwise

    In Ref. [1], we assigned a red square to all results which were renormalised at one loop in perturbation theory. We now feel that this is too restrictive, since the error arising from renormalisation constants, calculated in perturbation theory at one loop, is often estimated conservatively and reliably.

\(\bullet \) Running (where applicable):

  • For scale-dependent quantities, such as quark masses or \(B_K\), it is essential that contact with continuum perturbation theory can be established. Various different methods are used for this purpose (cf. Appendix A.3): Regularisation-independent Momentum Subtraction (RI/MOM), Schrödinger functional, direct comparison with (resummed) perturbation theory. Irrespective of the particular method used, the uncertainty associated with the choice of intermediate renormalisation scales in the construction of physical observables must be brought under control. This is best achieved by performing comparisons between non-perturbative and perturbative running over a reasonably broad range of scales. These comparisons were initially only made in the Schrödinger functional (SF) approach, but they are now also being performed in RI/MOM schemes. We mark the data for which information about non-perturbative running checks is available and give some details, but we do not attempt to translate this into a colour-code.
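The numerical thresholds listed above for the chiral extrapolation, the continuum extrapolation and finite-volume effects lend themselves to a compact summary in code. The Python sketch below is only an illustration of those thresholds: the function names are hypothetical, the reading of "an additional lattice spacing" for non-improved actions as "one more spacing at each level" is our assumption, and qualitative requirements (such as the three-point chiral analysis) are not encoded.

```python
STAR, CIRCLE, SQUARE = "green star", "green open circle", "red square"


def rate_chiral(m_pi_min_mev):
    """Chiral-extrapolation rating from the minimum pion mass in MeV."""
    if m_pi_min_mev < 200.0:
        return STAR
    if m_pi_min_mev <= 400.0:
        return CIRCLE
    return SQUARE


def rate_continuum(spacings_fm, improved=True):
    """Continuum-extrapolation rating from the list of lattice spacings in fm.

    Assumption: a non-improved action needs one more spacing at each level.
    """
    extra = 0 if improved else 1
    n_total = len(spacings_fm)
    n_fine = sum(1 for a in spacings_fm if a < 0.1)
    if n_total >= 3 + extra and n_fine >= 2:
        return STAR
    if n_total >= 2 + extra and n_fine >= 1:
        return CIRCLE
    return SQUARE


def rate_finite_volume(m_pi_min_L, n_volumes, l_min_fm):
    """Finite-volume rating (p-regime); m_pi_min_L is dimensionless."""
    if l_min_fm < 2.0:
        return SQUARE  # the L_min >= 2 fm assumption is violated
    if m_pi_min_L > 4.0 or n_volumes >= 3:
        return STAR
    if m_pi_min_L > 3.0 and n_volumes >= 2:
        return CIRCLE
    return SQUARE


def enters_average(tags):
    """A single red square disqualifies a result from the averages."""
    return SQUARE not in tags


# Invented example of a single calculation.
tags = [
    rate_chiral(190.0),
    rate_continuum([0.12, 0.09, 0.06]),
    rate_finite_volume(m_pi_min_L=3.5, n_volumes=2, l_min_fm=2.5),
]
print(tags, enters_average(tags))
```

The renormalisation rating and the running checks depend on methodological judgements rather than on numerical thresholds, so they are deliberately left out of this sketch.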

The pion mass plays an important rôle in the criteria relevant for chiral extrapolation and finite volume. For some of the regularisations used, however, it is not a trivial matter to identify this mass. In the case of twisted-mass fermions, discretisation effects give rise to a mass difference between charged and neutral pions even when the up- and down-quark masses are equal, with the charged pion being the heavier of the two. The discussion of the twisted-mass results presented in the following sections assumes that the artificial isospin-breaking effects which occur in this regularisation are under control. In addition, we assume that the mass of the charged pion may be used when evaluating the chiral-extrapolation and finite-volume criteria. In the case of staggered fermions, discretisation effects give rise to several light states with the quantum numbers of the pion (Footnote 4). The mass splitting among these “taste” partners represents a discretisation effect of \({\mathcal {O}}(a^2)\), which can be significant at large lattice spacings but shrinks as the spacing is reduced. In the discussion of the results obtained with staggered quarks given in the following sections, we assume that these artefacts are under control. When evaluating the chiral-extrapolation criteria, we conservatively identify \(M_{\pi ,\mathrm{min}}\) with the root-mean-square (RMS) of the masses of all taste partners. These masses are also used in Sects. 4 and 6 when evaluating the finite-volume criteria, while in Sects. 3, 5, 7 and 8, a more stringent finite-volume criterion is applied: \(M_{\pi ,\mathrm{min}}\) is identified with the mass of the lightest state.
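For staggered fermions, the RMS pion mass mentioned above can be computed as in the following Python sketch. It assumes the usual grouping of the 16 taste partners into multiplets of multiplicity 1, 4, 6, 4 and 1; the multiplet labels and the example masses are invented for illustration and do not correspond to any particular ensemble.

```python
import math

# Multiplicities of the taste multiplets (16 states in total).
TASTE_MULTIPLICITY = {"P": 1, "A": 4, "T": 6, "V": 4, "S": 1}


def rms_pion_mass(masses_mev):
    """Root-mean-square mass over all taste partners.

    masses_mev maps each multiplet label to its pion mass in MeV.
    """
    n_states = sum(TASTE_MULTIPLICITY.values())
    mean_square = sum(
        TASTE_MULTIPLICITY[t] * masses_mev[t] ** 2 for t in TASTE_MULTIPLICITY
    ) / n_states
    return math.sqrt(mean_square)


# Hypothetical taste-split pion masses in MeV.
example = {"P": 180.0, "A": 230.0, "T": 260.0, "V": 285.0, "S": 310.0}
m_rms = rms_pion_mass(example)
m_lightest = min(example.values())
print(f"RMS mass: {m_rms:.1f} MeV, lightest state: {m_lightest:.1f} MeV")
```

Since the RMS mass is never smaller than the mass of the lightest state, using it is the more conservative choice for the chiral-extrapolation criterion, whereas using the lightest state is the more stringent choice for a criterion based on \(M_{\pi ,\mathrm{min}} L\).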

2.1.2 Heavy-quark physics

This subsection discusses the criteria adopted for the heavy-quark quantities included in this review, characterised by non-zero charm and bottom quantum numbers. There are several different approaches to treating heavy quarks on the lattice, each with their own issues and considerations. In general all \(b\)-quark methods rely on the use of Effective Field Theory (EFT) at some point in the computation, either via direct simulation of the EFT, use of the EFT to estimate the size of cutoff errors, or use of the EFT to extrapolate from the simulated lattice quark mass up to the physical \(b\)-quark mass. Some simulations of charm-quark quantities use the same heavy-quark methods as for bottom quarks, but there are also computations that use improved light-quark actions to simulate charm quarks. Hence, with some methods and for some quantities, truncation effects must be considered together with discretisation errors. With other methods, discretisation errors are more severe for heavy-quark quantities than for the corresponding light-quark quantities.

In order to address these complications, we add a new heavy-quark treatment category to the ratings system. The purpose of this criterion is to provide a guideline for the level of action and operator improvement needed in each approach to make reliable calculations possible, in principle. In addition, we replace the rating criteria for the continuum extrapolations of Sect. 2.1.1 with a new empirical approach based on the size of observed discretisation errors in the lattice simulation data. This accounts for the fact that whether discretisation and truncation effects in a given calculation are sufficiently small as to be controllable depends not only on the range of lattice spacings used in the simulations, but also on the simulated heavy-quark masses and on the level of action and operator improvement. For the other categories, we adopt the same strict criteria as in Sect. 2.1.1, with one minor modification, as explained below.

\(\bullet \) Heavy-quark treatment

  • A description of the different approaches to treating heavy quarks on the lattice is given in Appendix A.1.3 including a discussion of the associated discretisation, truncation, and matching errors. For truncation errors we use HQET power counting throughout, since this review is focussed on heavy quark quantities involving \(B\) and \(D\) mesons. Here we describe the criteria for how each approach must be implemented in order to receive an acceptable rating for both the heavy quark actions and the weak operators. Heavy-quark implementations without the level of improvement described below are rated not acceptable. The matching is evaluated together with renormalisation, using the renormalisation criteria described in Sect. 2.1.1. We emphasise that the heavy-quark implementations rated as acceptable and described below have been validated in a variety of ways, such as via phenomenological agreement with experimental measurements, consistency between independent lattice calculations, and numerical studies of truncation errors. These tests are summarised in Sect. 8.

Relativistic heavy quark actions:

  • at least tree-level \(O(a)\) improved action and weak operators

    This is similar to the requirements for light quark actions. All current implementations of relativistic heavy quark actions satisfy these criteria.

NRQCD:

  • tree-level matched through \(O(1/m_{h})\) and improved through \(O(a^2)\)

    The current implementations of NRQCD satisfy these criteria, and also include tree-level corrections of \(O(1/m_{h}^2)\) in the action.

HQET:

  • tree-level matched through \(O(1/m_{h})\) with discretisation errors starting at \(O(a^2)\)

    The current implementation of HQET by the ALPHA collaboration satisfies these criteria with an action and weak operators that are non-perturbatively matched through \(O(1/m_{h})\). Calculations that exclusively use a static-limit action do not satisfy these criteria, since the static-limit action, by definition, does not include \(1/m_{h}\) terms. However, for SU(3)-breaking ratios such as \(\xi \) and \(f_{B_{s}}/f_B\), truncation errors start at \(O((m_{s} - m_{d})/m_{h})\). We therefore consider lattice calculations of such ratios that use a static-limit action to still have controllable truncation errors.

Light-quark actions for heavy quarks:

  • discretisation errors starting at \(O(a^2)\) or higher

    This applies to calculations that use the tmWilson action, a non-perturbatively improved Wilson action, or the HISQ action for charm-quark quantities. It also applies to calculations that use these light-quark actions in the charm region and above together with either the static limit or with an HQET-inspired extrapolation to obtain results at the physical \(b\)-quark mass. In these cases, the continuum-extrapolation criteria must be applied to the entire range of heavy-quark masses used in the calculation.

\(\bullet \) Continuum extrapolation:

  • First we introduce the following definitions:

    $$\begin{aligned} D(a) = \frac{Q(a) - Q(0)}{Q(a)}, \end{aligned}$$
    (1)

    where \(Q(a)\) denotes the central value of quantity \(Q\) obtained at lattice spacing \(a\) and \(Q(0)\) denotes the continuum extrapolated value. \(D(a)\) is a measure of how far the continuum extrapolated result is from the lattice data. We evaluate this quantity on the smallest lattice spacing used in the calculation, \(a_\mathrm{min}\).

    $$\begin{aligned} \delta (a) = \frac{Q(a) - Q(0)}{\sigma _Q}, \end{aligned}$$
    (2)

    where \(\sigma _Q\) is the combined statistical and systematic (due to the continuum extrapolation) error. \(\delta (a)\) is a measure of how well the continuum-extrapolated result agrees with the lattice data within the statistical and systematic errors of the calculation. Again, we evaluate this quantity on the smallest lattice spacing used in the calculation, \(a_\mathrm{min}\).

  • ★ (i) Three or more lattice spacings, (ii) \(a^2_\mathrm{max} / a^2_\mathrm{min} \ge 2\), (iii) \(D(a_\mathrm{min}) \le 2\,\%\), and (iv) \(\delta (a_\mathrm{min}) \le 1\)

  • ○ (i) Two or more lattice spacings, (ii) \(a^2_\mathrm{max} / a^2_\mathrm{min} \ge 1.4\), (iii) \(D(a_\mathrm{min}) \le 10\,\%\), and (iv) \(\delta (a_\mathrm{min}) \le 2\)

  • ■ otherwise.

    For the time being, these new criteria for the quality of the continuum extrapolation have only been adopted for the heavy-quark quantities, but their use may be extended to all FLAG quantities in future reviews.
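As an illustration, the criteria above can be written as a short Python function. This is only a sketch under stated assumptions: the name `continuum_tier`, the integer labels (1 for the first set of conditions, 2 for the second, 3 otherwise) and the use of absolute values for \(D\) and \(\delta \) are choices made for this example and are not part of Eqs. (1) and (2).

```python
def continuum_tier(spacings, q_amin, q_cont, sigma_q):
    """Classify a continuum extrapolation with the D and delta criteria.

    spacings : lattice spacings used in the calculation (any common unit)
    q_amin   : central value Q(a_min) at the smallest lattice spacing
    q_cont   : continuum-extrapolated value Q(0)
    sigma_q  : combined statistical and extrapolation error
    Returns 1, 2 or 3 (hypothetical labels for the three tiers).
    """
    distinct = sorted(set(spacings))
    ratio = (distinct[-1] / distinct[0]) ** 2       # a_max^2 / a_min^2
    d = abs(q_amin - q_cont) / abs(q_amin)          # |D(a_min)|, Eq. (1)
    delta = abs(q_amin - q_cont) / sigma_q          # |delta(a_min)|, Eq. (2)

    if len(distinct) >= 3 and ratio >= 2.0 and d <= 0.02 and delta <= 1.0:
        return 1
    if len(distinct) >= 2 and ratio >= 1.4 and d <= 0.10 and delta <= 2.0:
        return 2
    return 3
```

Taking absolute values treats upward and downward shifts between \(Q(a_\mathrm{min})\) and \(Q(0)\) symmetrically; a signed reading of the thresholds would require a different comparison.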

\(\bullet \) Finite-volume:

  • ★ \(M_{\pi ,\mathrm{min}} L \gtrsim 3.7\) or two volumes at fixed parameters

  • ○ \(M_{\pi ,\mathrm{min}} L \gtrsim 3\)

  • ■ otherwise

Here the boundary between green star and open circle is slightly relaxed compared to that in Sect. 2.1.1 to account for the fact that heavy-quark quantities are less sensitive to this systematic error than light-quark quantities. A rating requires an estimate of the finite-volume error either by analysing data on two or more physical volumes (with all other parameters fixed) or by using finite-volume chiral perturbation theory. In the case of staggered sea quarks, \(M_{\pi ,\mathrm{min}}\) refers to the lightest (taste Goldstone) pion mass.
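The finite-volume thresholds can be checked in the same spirit. The sketch below converts a pion mass in MeV and a box length in fm into the dimensionless product \(M_\pi L\) and applies the two thresholds. The function names and integer tier labels are assumptions of this example, the approximate inequalities are treated as sharp, and the additional requirement of a finite-volume error estimate is not encoded.

```python
HBAR_C_MEV_FM = 197.3269804  # hbar*c in MeV fm, used to make M*L dimensionless


def mpi_times_l(m_pi_mev, box_length_fm):
    """Return the dimensionless product M_pi * L."""
    return m_pi_mev * box_length_fm / HBAR_C_MEV_FM


def finite_volume_tier(m_pi_min_l, two_volumes=False):
    """Return 1, 2 or 3 (hypothetical labels) for the finite-volume criterion.

    m_pi_min_l  : M_{pi,min} * L for the lightest pion in the simulation
    two_volumes : True if two volumes at fixed parameters are available
    """
    if m_pi_min_l >= 3.7 or two_volumes:
        return 1
    if m_pi_min_l >= 3.0:
        return 2
    return 3
```

For example, `finite_volume_tier(mpi_times_l(170.0, 4.0))` classifies a hypothetical ensemble with a 170 MeV pion in a 4 fm box.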

2.2 Averages and estimates

For many observables there are enough independent lattice calculations of good quality that it makes sense to average them and propose such an average as the best current lattice number. In order to decide whether this is true for a certain observable, we rely on the colour coding. We restrict the averages to data for which the colour code does not contain any red tags. In some cases, the averaging procedure nevertheless leads to a result which in our opinion does not cover all uncertainties. This is related to the fact that procedures for estimating errors and the resulting conclusions necessarily have an element of subjectivity, and would vary between groups even with the same data set. In order to stay on the conservative side, we may replace the average by an estimate (or a range), which we consider as a fair assessment of the knowledge acquired on the lattice at present. This estimate is not obtained with a prescribed mathematical procedure, but it is based on a critical analysis of the available information.

There are two other important criteria which also play a role in this respect, but which cannot be colour coded, because a systematic improvement is not possible. These are: (i) the publication status, and (ii) the number of flavours \(N_\mathrm{f}\). As far as the former criterion is concerned, we adopt the following policy: we average only results which have been published in peer-reviewed journals, i.e. they have been endorsed by referee(s). The only exception to this rule consists in obvious updates of previously published results, typically presented in conference proceedings. Such updates, which supersede the corresponding results in the published papers, are included in the averages. Nevertheless, all results are listed and their publication status is identified by the following symbols:

\(\bullet \) Publication status:

  • A published or plain update of published results

  • P preprint

  • C conference contribution

Note that updates of earlier results rely, at least partially, on the same gauge-field configuration ensembles. For this reason, we do not average updates with earlier results. In the present edition, the publication status on November 30, 2013 is relevant. If the paper appeared in print after that date this is accounted for in the bibliography, but it does not affect the averages.

In this review we present results from simulations with \(N_\mathrm{f}=2\), \(N_\mathrm{f}=2+1\) and \(N_\mathrm{f}=2+1+1\) (for \( r_0 \Lambda _{\overline{\mathrm{MS}}}\) also with \(N_\mathrm{f}=0\)). We are not aware of an a priori way to quantitatively estimate the difference between results produced in simulations with a different number of dynamical quarks. We therefore average results at fixed \(N_\mathrm{f}\) separately; averages of calculations with different \(N_\mathrm{f}\) will not be provided.

To date, no significant differences between results with different values of \(N_\mathrm{f}\) have been observed. In the future, as the accuracy and the control over systematic effects in lattice calculations increase, it will hopefully be possible to see a difference between \(N_\mathrm{f}= 2\) and \(N_\mathrm{f}= 2 + 1\) calculations and so determine the size of the Zweig-rule violations related to strange-quark loops. This is a very interesting issue per se, and one which can be quantitatively addressed only with lattice calculations.

2.3 Averaging procedure and error analysis

In [1], the FLAG averages and their errors were estimated through the following procedure: Having added in quadrature statistical and systematic errors for each individual result, we obtained their weighted \(\chi ^2\) average. This was our central value. If the fit was of good quality (\(\chi _\mathrm{min}^2/\hbox {dof} \le 1\)), we calculated the net uncertainty \(\delta \) from \(\chi ^2 = \chi _\mathrm{min}^2 + 1\); otherwise, we inflated the result obtained in this way by the factor \(S = \sqrt{\chi ^2/\hbox {dof}}\). Whenever this \(\chi ^2\) minimisation procedure resulted in a total error which was smaller than the smallest systematic error of any individual lattice result, we assigned the smallest systematic error of that result to the total systematic error in the average.
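A minimal sketch of this procedure for uncorrelated inputs is given below. For a constant fit with Gaussian errors, the condition \(\chi ^2 = \chi _\mathrm{min}^2 + 1\) gives \(\delta = (\sum _i \sigma _i^{-2})^{-1/2}\), which is what the code uses. The function name is hypothetical, and the final step involving the smallest systematic error is deliberately left out.

```python
import math


def weighted_average(values, errors):
    """Inverse-variance weighted average with an optional stretch factor S.

    values : central values x_i
    errors : total errors sigma_i (statistical and systematic in quadrature)
    Returns (mean, error, chi2, dof).
    """
    weights = [1.0 / e ** 2 for e in errors]
    wsum = sum(weights)
    mean = sum(w * x for w, x in zip(weights, values)) / wsum
    error = math.sqrt(1.0 / wsum)               # from chi2 = chi2_min + 1

    chi2 = sum(((x - mean) / e) ** 2 for x, e in zip(values, errors))
    dof = len(values) - 1
    if dof > 0 and chi2 / dof > 1.0:
        error *= math.sqrt(chi2 / dof)          # inflate by S = sqrt(chi2/dof)
    return mean, error, chi2, dof
```

With two inputs \(1.0\pm 1.0\) and \(3.0\pm 1.0\), for instance, \(\chi ^2/\hbox {dof}=2\) and the naive error \(1/\sqrt{2}\) is stretched to \(1.0\).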

One of the problems arising when forming such averages is that not all of the data sets are independent; in fact, some rely on the same ensembles. In particular, the same gauge-field configurations, produced with a given fermion discretisation, are often used by different research teams with different valence-quark lattice actions, obtaining results which are not really independent. In the present paper we have modified our averaging procedure in order to account for such correlations. To start with, we examine error budgets for individual calculations and look for potentially correlated uncertainties. Specific problems encountered in connection with correlations between different data sets are commented on in the text. If there is any reason to believe that a source of error is correlated between two calculations, a 100 % correlation is assumed. We then obtain the central value from a \(\chi ^2\) weighted average, evaluated by adding statistical and systematic errors in quadrature (just as in Ref. [1]): for a set of individual measurements \(x_i\) with error \(\sigma _i\) and correlation matrix \(C_{ij}\), central value and error of the average are given by

$$\begin{aligned} x_\mathrm{average}&= \sum _i x_i\, \omega _i, \quad \omega _i = \dfrac{\sigma _i^{-2}}{\sum _j\sigma _j^{-2}},\end{aligned}$$
(3)
$$\begin{aligned} \sigma ^2_\mathrm{average}&= \sum _{i,j} \omega _i \,\omega _j \,C_{ij}. \end{aligned}$$
(4)

The correlation matrix for the set of correlated lattice results is estimated with Schmelling’s prescription [16]. When necessary, the statistical and systematic error bars are stretched by a factor \(S\), as specified in the previous paragraph.
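Equations (3) and (4) can be sketched as follows. The code assumes that \(C_{ij}\) is normalised such that \(C_{ii}=\sigma _i^2\), so that Eq. (4) reduces to the usual uncorrelated result when the off-diagonal entries vanish. The helper `covariance_from_sources` is a hypothetical construction that treats each named error source as 100 % correlated between results when it is listed as shared; it is a simple stand-in for illustration and is not claimed to reproduce the prescription of Ref. [16].

```python
import math


def covariance_from_sources(components, shared):
    """Build C_ij from per-source error budgets.

    components : list of dicts, components[i][name] = error of result i
                 from the source `name`
    shared     : set of source names assumed 100 % correlated across results
    """
    n = len(components)
    cov = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            for name, err_i in components[i].items():
                if i == j:
                    cov[i][j] += err_i ** 2
                elif name in shared and name in components[j]:
                    cov[i][j] += err_i * components[j][name]
    return cov


def correlated_average(values, cov):
    """Central value from Eq. (3) and error from Eq. (4)."""
    n = len(values)
    inv_var = [1.0 / cov[i][i] for i in range(n)]
    norm = sum(inv_var)
    omega = [v / norm for v in inv_var]         # weights omega_i of Eq. (3)
    mean = sum(w * x for w, x in zip(omega, values))
    var = sum(omega[i] * omega[j] * cov[i][j]
              for i in range(n) for j in range(n))
    return mean, math.sqrt(var)
```

The weights are still the uncorrelated inverse-variance weights of Eq. (3); the correlations enter only through the error in Eq. (4), which therefore grows when a shared source is declared.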

3 Masses of the light quarks

Quark masses are fundamental parameters of the Standard Model. An accurate determination of these parameters is important for both phenomenological and theoretical applications. The charm and bottom masses, for instance, enter the theoretical expressions of several cross sections and decay rates in heavy-quark expansions. The up-, down- and strange-quark masses govern the amount of explicit chiral symmetry breaking in QCD. From a theoretical point of view, the values of quark masses provide information about the flavour structure of physics beyond the Standard Model. The Review of Particle Physics of the Particle Data Group contains a review of quark masses [17], which covers light as well as heavy flavours. The present summary only deals with the light-quark masses (those of the up, down and strange quarks), but it discusses the lattice results for these in more detail.

Quark masses cannot be measured directly with experiment because quarks cannot be isolated, as they are confined inside hadrons. On the other hand, quark masses are free parameters of the theory and, as such, cannot be obtained on the basis of purely theoretical considerations. Their values can only be determined by comparing the theoretical prediction for an observable, which depends on the quark mass of interest, with the corresponding experimental value. What makes light-quark masses particularly difficult to determine is the fact that they are very small (for the up and down) or small (for the strange) compared to typical hadronic scales. Thus, their impact on typical hadronic observables is minute and it is difficult to isolate their contribution accurately.

Fortunately, the spontaneous breaking of SU(3)\(_L\otimes \)SU(3)\(_R\) chiral symmetry provides observables which are particularly sensitive to the light-quark masses: the masses of the resulting Nambu–Goldstone bosons (NGB), i.e. pions, kaons and etas. Indeed, the Gell-Mann–Oakes–Renner relation [18] predicts that the squared mass of a NGB is directly proportional to the sum of the masses of the quark and antiquark which compose it, up to higher-order mass corrections. Moreover, because these NGBs are light and are composed of only two valence particles, their masses have a particularly clean statistical signal in lattice-QCD calculations. In addition, the experimental uncertainties on these meson masses are negligible.
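To make the leading-order content of the Gell-Mann–Oakes–Renner relation concrete: in the isospin limit it reads \(M_\pi ^2 = 2B\,m_{ud}\) and \(M_K^2 = B\,(m_{ud}+m_{s})\), with \(B\) a common low-energy constant, so that \(m_{s}/m_{ud} = (2M_K^2-M_\pi ^2)/M_\pi ^2\) up to higher-order mass corrections. The sketch below evaluates this ratio. The function name and the round input masses in the usage note are illustrative assumptions, and the resulting number is a leading-order estimate only, not a lattice result.

```python
def lo_mass_ratio(m_pi, m_k):
    """Leading-order m_s / m_ud from isospin-limit pion and kaon masses.

    Uses M_pi^2 = 2 B m_ud and M_K^2 = B (m_ud + m_s); the constant B
    cancels in the ratio. Both masses must be given in the same unit.
    """
    return (2.0 * m_k ** 2 - m_pi ** 2) / m_pi ** 2
```

Calling `lo_mass_ratio(135.0, 495.0)` with round masses in MeV gives a ratio close to 26, which illustrates how sensitive the meson masses are to the light-quark masses.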

Three flavour QCD has four free parameters: the strong coupling, \(\alpha _\mathrm{s}\) (alternatively \(\Lambda _\mathrm{QCD}\)) and the up, down and strange quark masses, \(m_{u}\), \(m_{d}\) and \(m_{s}\). However, present day lattice calculations are often performed in the isospin limit, and the up and down quark masses (especially those in the sea) usually get replaced by a single parameter: the isospin-averaged up- and down-quark mass, \(m_{ud}=\frac{1}{2}(m_{u}+m_{d})\). A lattice determination of these parameters requires two steps:

  1. Calculations of three experimentally measurable quantities are used to fix the three bare parameters. As already discussed, NGB masses are particularly appropriate for fixing the light-quark masses. Another observable, such as the mass of a member of the baryon octet, can be used to fix the overall scale. It is important to note that until recently, most calculations were performed at values of \(m_{ud}\) which were still substantially larger than its physical value, typically four times as large. Reaching the physical up- and down-quark mass point required a significant extrapolation. This situation is changing fast. The PACS-CS [19–21] and BMW [22, 23] calculations were performed with masses all the way down to their physical value (and even below in the case of BMW), albeit in very small volumes for PACS-CS. More recently, MILC [24] and RBC/UKQCD [25] have also extended their simulations almost down to the physical point, by considering pions with \(M_\pi \gtrsim 170\,\mathrm{MeV}\).Footnote 5 Regarding the strange quark, modern simulations can easily include it with masses that bracket its physical value, and only interpolations are needed.

  2. Renormalisations of these bare parameters must be performed to relate them to the corresponding cutoff-independent, renormalised parameters.Footnote 6 These are short-distance calculations, which may be performed perturbatively. Experience shows that one-loop calculations are unreliable for the renormalisation of quark masses: usually at least two loops are required to have trustworthy results. Therefore, it is best to perform the renormalisations non-perturbatively to avoid potentially large perturbative uncertainties due to neglected higher-order terms. However, we will include in our averages one-loop results which carry a solid estimate of the systematic uncertainty due to the truncation of the series.

Of course, in quark mass ratios the renormalisation factor cancels, so that this second step is no longer relevant.

3.1 Contributions from the electromagnetic interaction

As mentioned in Sect. 2.1, the present review relies on the hypothesis that, at low energies, the Lagrangian \(\mathcal{L}_{\mathrm{QCD}}+\mathcal{L}_{\mathrm{QED}}\) describes nature to a high degree of precision. Moreover, we assume that, at the accuracy reached by now and for the quantities discussed here, the difference between the results obtained from simulations with three dynamical flavours and full QCD is small in comparison with the quoted systematic uncertainties. This will soon no longer be the case. The electromagnetic (e.m.) interaction, on the other hand, cannot be ignored. Quite generally, when comparing QCD calculations with experiment, radiative corrections need to be applied. In lattice simulations, where the QCD parameters are fixed in terms of the masses of some of the hadrons, the electromagnetic contributions to these masses must be accounted for.Footnote 7

The electromagnetic interaction plays a crucial role in determinations of the ratio \(m_{u}/m_{d}\), because the isospin-breaking effects generated by this interaction are comparable to those from \(m_{u}\ne m_{d}\) (see Sect. 3.4). In determinations of the ratio \(m_{s}/m_{ud}\), the electromagnetic interaction is less important, but at the accuracy reached, it cannot be neglected. The reason is that, in the determination of this ratio, the pion mass enters as an input parameter. Because \(M_\pi \) represents a small symmetry-breaking effect, it is rather sensitive to the perturbations generated by QED.

We distinguish the physical mass \(M_P\), \(P\in \{\pi ^+, \pi ^0, K^+, K^0\}\), from the mass \(\hat{M}_P\) within QCD alone. The e.m. self-energy is the difference between the two, \(M_P^\gamma \equiv M_P-\hat{M}_P\). Because the self-energy of the Nambu–Goldstone bosons diverges in the chiral limit, it is convenient to replace it by the contribution of the e.m. interaction to the square of the mass,

$$\begin{aligned} \Delta _{P}^\gamma \equiv M_P^2-\hat{M}_P^2= 2\,M_P M_P^\gamma +O(e^4). \end{aligned}$$
(5)

The main effect of the e.m. interaction is an increase in the mass of the charged particles, generated by the photon cloud that surrounds them. The self-energies of the neutral ones are comparatively small, particularly for the Nambu–Goldstone bosons, which do not have a magnetic moment. Dashen’s theorem [31] confirms this picture, as it states that, to leading order (LO) of the chiral expansion, the self-energies of the neutral NGBs vanish, while the charged ones obey \(\Delta _{K^+}^\gamma = \Delta _{\pi ^+}^\gamma \). It is convenient to express the self-energies of the neutral particles as well as the mass difference between the charged and neutral pions within QCD in units of the observed mass difference, \(\Delta _\pi \equiv M_{\pi ^+}^2-M_{\pi ^0}^2\):

$$\begin{aligned} \Delta _{\pi ^0}^\gamma \equiv \epsilon _{\pi ^0}\,\Delta _\pi ,\quad \Delta _{K^0}^\gamma \equiv \epsilon _{K^0}\,\Delta _\pi ,\quad \hat{M}_{\pi ^+}^2- \hat{M}_{\pi ^0}^2\equiv \epsilon _{m}\,\Delta _\pi .\end{aligned}$$
(6)

In this notation, the self-energies of the charged particles are given by

$$\begin{aligned}&\Delta _{\pi ^+}^\gamma =(1+\epsilon _{\pi ^0}-\epsilon _{m})\,\Delta _\pi ,\nonumber \\&\quad \Delta _{K^+}^\gamma =(1+\epsilon +\epsilon _{K^0}-\epsilon _{m})\,\Delta _\pi ,\end{aligned}$$
(7)

where the dimensionless coefficient \(\epsilon \) parameterises the violation of Dashen’s theorem,Footnote 8

$$\begin{aligned} \Delta _{K^+}^\gamma -\Delta _{K^0}^\gamma - \Delta _{\pi ^+}^\gamma +\Delta _{\pi ^0}^\gamma \equiv \epsilon \,\Delta _\pi .\end{aligned}$$
(8)

Any determination of the light-quark masses based on a calculation of the masses of \(\pi ^+,K^+\) and \(K^0\) within QCD requires an estimate for the coefficients \(\epsilon \), \(\epsilon _{\pi ^0}\), \(\epsilon _{K^0}\) and \(\epsilon _{m}\).
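The bookkeeping of Eqs. (6) to (8) can be checked with a few lines of Python. The sketch below works in units of \(\Delta _\pi \); the function names are hypothetical. It returns the four squared self-energies from Eqs. (6) and (7) and then verifies that the combination in Eq. (8) gives back \(\epsilon \).

```python
def squared_self_energies(eps, eps_pi0, eps_k0, eps_m):
    """Return Delta_P^gamma / Delta_pi for P = pi+, pi0, K+, K0."""
    return {
        "pi+": 1.0 + eps_pi0 - eps_m,           # Eq. (7), first line
        "pi0": eps_pi0,                         # Eq. (6)
        "K+": 1.0 + eps + eps_k0 - eps_m,       # Eq. (7), second line
        "K0": eps_k0,                           # Eq. (6)
    }


def dashen_violation(delta):
    """Left-hand side of Eq. (8) in units of Delta_pi."""
    return delta["K+"] - delta["K0"] - delta["pi+"] + delta["pi0"]
```

For any choice of the four coefficients, `dashen_violation(squared_self_energies(...))` returns the input value of \(\epsilon \), and setting \(\epsilon =0\) gives equal charged self-energies when the neutral ones vanish, as stated by Dashen's theorem.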

The first determination of the self-energies on the lattice was carried out by Duncan et al. [33]. Using the quenched approximation, they arrived at \(M_{K^+}^\gamma -M_{K^0}^\gamma = 1.9\,\hbox {MeV}\). Actually, the parameterisation of the masses given in that paper yields an estimate for all but one of the coefficients introduced above (since the mass splitting between the charged and neutral pions in QCD is neglected, the parameterisation amounts to setting \(\epsilon _{m}=0\) ab initio). Evaluating the differences between the masses obtained at the physical value of the electromagnetic coupling constant and at \(e=0\), we obtain \(\epsilon = 0.50(8)\), \(\epsilon _{\pi ^0} = 0.034(5)\) and \(\epsilon _{K^0} = 0.23(3)\). The errors quoted are statistical only: an estimate of lattice systematic errors is not possible from the limited results of Duncan et al. [33]. The result for \(\epsilon \) indicates that the violation of Dashen’s theorem is sizeable: according to this calculation, the non-leading contributions to the self-energy difference of the kaons amount to 50 % of the leading term. The result for the self-energy of the neutral pion cannot be taken at face value, because it is small, comparable to the neglected mass difference \(\hat{M}_{\pi ^+}-\hat{M}_{\pi ^0}\). To illustrate this, we note that the numbers quoted above are obtained by matching the parameterisation with the physical masses for \(\pi ^0\), \(K^+\) and \(K^0\). This gives a mass for the charged pion that is too high by 0.32 MeV. Tuning the parameters instead such that \(M_{\pi ^+}\) comes out correctly, the result for the self-energy of the neutral pion becomes larger: \(\epsilon _{\pi ^0}=0.10(7)\) where, again, the error is statistical only.

In an update of this calculation by the RBC collaboration [34] (RBC 07), the electromagnetic interaction is still treated in the quenched approximation, but the strong interaction is simulated with \(N_\mathrm{f}=2\) dynamical quark flavours. The quark masses are fixed with the physical masses of \(\pi ^0\), \(K^+\) and \(K^0\). The outcome for the difference in the electromagnetic self-energy of the kaons reads \(M_{K^+}^\gamma -M_{K^0}^\gamma = 1.443(55)\,\hbox {MeV}\). This corresponds to a remarkably small violation of Dashen’s theorem. Indeed, a recent extension of this work to \(N_\mathrm{f}=2+1\) dynamical flavours [32] leads to a significantly larger self-energy difference: \(M_{K^+}^\gamma -M_{K^0}^\gamma = 1.87(10)\,\hbox {MeV}\), in good agreement with the estimate of Duncan et al. [33]. Expressed in terms of the coefficient \(\epsilon \) that measures the size of the violation of Dashen’s theorem, it corresponds to \(\epsilon =0.5(1)\).

The input for the electromagnetic corrections used by MILC is specified in [35]. In their analysis of the lattice data, \(\epsilon _{\pi ^0}\), \(\epsilon _{K^0}\) and \(\epsilon _{m}\) are set equal to zero. For the remaining coefficient, which plays a crucial role in determinations of the ratio \(m_{u}/m_{d}\), the very conservative range \(\epsilon =1\pm 1\) was used in MILC 04 [36], while in more recent work, in particular in MILC 09 [15] and MILC 09A [37], this input is replaced by \(\epsilon =1.2\pm 0.5\), as suggested by phenomenological estimates for the corrections to Dashen’s theorem [38, 39]. Results of an evaluation of the electromagnetic self-energies based on \(N_\mathrm{f}=2+1\) dynamical quarks in the QCD sector and on the quenched approximation in the QED sector are also reported by MILC [40–42]. Their preliminary result is \(\bar{\epsilon }=0.65(7)(14)(10)\), where the first error is statistical, the second systematic, and the third a separate systematic for the combined chiral and continuum extrapolation. The estimate of the systematic error does not yet include finite-volume effects. With the estimate for \(\epsilon _{m}\) given in (9), this result corresponds to \(\epsilon = 0.62(7)(14)(10)\). Similar preliminary results were previously reported by the BMW collaboration in conference proceedings [43, 44].

The RM123 collaboration employs a new technique to compute e.m. shifts in hadron masses in two-flavour QCD: the effects are included at leading order in the electromagnetic coupling \(\alpha \) through simple insertions of the fundamental electromagnetic interaction in quark lines of relevant Feynman graphs [45]. They find \(\epsilon =0.79(18)(18)\) where the first error is statistical and the second is the total systematic error resulting from chiral, finite-volume, discretisation, quenching and fitting errors all added in quadrature.

The effective Lagrangian that governs the self-energies to next-to-leading order (NLO) of the chiral expansion was set up in [46]. The estimates in [38, 39] are obtained by replacing QCD with a model, matching this model with the effective theory and assuming that the effective coupling constants obtained in this way represent a decent approximation to those of QCD. For alternative model estimates and a detailed discussion of the problems encountered in models based on saturation by resonances, see [47–49]. In the present review of the information obtained on the lattice, we avoid the use of models altogether.

There is an indirect phenomenological determination of \(\epsilon \), which is based on the decay \(\eta \rightarrow 3\pi \) and does not rely on models. The result for the quark mass ratio \(Q\), defined in (24) and obtained from a dispersive analysis of this decay, implies \(\epsilon = 0.70(28)\) (see Sect. 3.4). While the values found in older lattice calculations [32–34] are a little less than one standard deviation lower, the most recent determinations [40–45, 50], though still preliminary, are in excellent agreement with this result and have significantly smaller error bars. However, even in the more recent calculations, e.m. effects are treated in the quenched approximation. Thus, we choose to quote \(\epsilon = 0.7(3)\), which is essentially the \(\eta \rightarrow 3\pi \) result and generously covers the range of post-2010 lattice results. Note that this value has an uncertainty which is reduced by about 40 % compared to the result quoted in the first edition of the FLAG review [1].

We add a few comments concerning the physics of the self-energies and then specify the estimates used as an input in our analysis of the data. The Cottingham formula [51] represents the self-energy of a particle as an integral over electron scattering cross sections; elastic as well as inelastic reactions contribute. For the charged pion, the term due to elastic scattering, which involves the square of the e.m. form factor, makes a substantial contribution. In the case of the \(\pi ^0\), this term is absent, because the form factor vanishes on account of charge conjugation invariance. Indeed, the contribution from the form factor to the self-energy of the \(\pi ^+\) roughly reproduces the observed mass difference between the two particles. Furthermore, the numbers given in [52–54] indicate that the inelastic contributions are significantly smaller than the elastic contributions to the self-energy of the \(\pi ^+\). The low energy theorem of Das et al. [55] ensures that, in the limit \(m_{u},m_{d}\rightarrow 0\), the e.m. self-energy of the \(\pi ^0\) vanishes, while the one of the \(\pi ^+\) is given by an integral over the difference between the vector and axial-vector spectral functions. The estimates for \(\epsilon _{\pi ^0}\) obtained in [33] are consistent with the suppression of the self-energy of the \(\pi ^0\) implied by chiral SU(2) \(\times \) SU(2). In our opinion, \(\epsilon _{\pi ^0}=0.07(7)\) is a conservative estimate for this coefficient. The self-energy of the \(K^0\) is suppressed less strongly, because it remains different from zero if \(m_{u}\) and \(m_{d}\) are taken massless and only disappears if \(m_{s}\) is turned off as well. Note also that, since the e.m. form factor of the \(K^0\) is different from zero, the self-energy of the \(K^0\) does pick up an elastic contribution. The lattice result for \(\epsilon _{K^0}\) indicates that the violation of Dashen’s theorem is smaller than in the case of \(\epsilon \). In the following, we use \(\epsilon _{K^0}=0.3(3)\).

Finally, we consider the mass splitting between the charged and neutral pions in QCD. This effect is known to be very small, because it is of second order in \(m_{u}-m_{d}\). There is a parameter-free prediction, which expresses the difference \(\hat{M}_{\pi ^+}^2-\hat{M}_{\pi ^0}^2\) in terms of the physical masses of the pseudoscalar octet and is valid to NLO of the chiral perturbation series. Numerically, the relation yields \(\epsilon _{m}=0.04\) [56], indicating that this contribution does not play a significant role at the present level of accuracy. We attach a conservative error also to this coefficient: \(\epsilon _{m}=0.04(2)\). The lattice result for the self-energy difference of the pions, reported in [32], \(M_{\pi ^+}^\gamma -M_{\pi ^0}^\gamma = 4.50(23)\,\hbox {MeV}\), agrees with this estimate: expressed in terms of the coefficient \(\epsilon _{m}\) that measures the pion mass splitting in QCD, the result corresponds to \(\epsilon _{m}=0.04(5)\). The corrections of next-to-next-to-leading order (NNLO) have been worked out [57], but the numerical evaluation of the formulae again meets with the problem that the relevant effective coupling constants are not reliably known.

In summary, we use the following estimates for the e.m. corrections:

$$\begin{aligned} \epsilon ={ 0.7(3)} ,\ \epsilon _{\pi ^0}=0.07(7),\ \epsilon _{K^0}=0.3(3),\ \epsilon _{m}=0.04(2).\nonumber \\ \end{aligned}$$
(9)

While the range used for the coefficient \(\epsilon \) affects our analysis in a significant way, the numerical values of the other coefficients only serve to set the scale of these contributions. The range given for \(\epsilon _{\pi ^0}\) and \(\epsilon _{K^0}\) may be overly generous, but because of the exploratory nature of the lattice determinations, we consider it advisable to use a conservative estimate.

Treating the uncertainties in the four coefficients as statistically independent and adding errors in quadrature, the numbers in equation (9) yield the following estimates for the e.m. self-energies,

$$\begin{aligned}&M_{\pi ^+}^\gamma = 4.7(3)\, \hbox {MeV},\ M_{\pi ^0}^\gamma = 0.3(3)\,\hbox {MeV} ,\nonumber \\&\quad M_{\pi ^+}^\gamma -M_{\pi ^0}^\gamma =4.4(1)\, \hbox {MeV},\nonumber \\&M_{K^+}^\gamma = 2.5(5)\,\hbox {MeV},\ M_{K^0}^\gamma =0.4(4)\,\hbox {MeV},\nonumber \\&\quad M_{K^+}^\gamma -M_{K^0}^\gamma = 2.1(4)\, \hbox {MeV}, \end{aligned}$$
(10)

and for the pion and kaon masses occurring in the QCD sector of the Standard Model,

$$\begin{aligned}&\hat{M}_{\pi ^+}= 134.8(3)\, \hbox {MeV},\ \hat{M}_{\pi ^0} = 134.6(3)\,\hbox {MeV} ,\nonumber \\&\quad \hat{M}_{\pi ^+} -\hat{M}_{\pi ^0}=0.2(1)\, \hbox {MeV},\nonumber \\&\hat{M}_{K^+}= 491.2(5)\,\hbox {MeV},\ \hat{M}_{K^0} =497.2(4)\,\hbox {MeV},\nonumber \\&\quad \hat{M}_{K^+}-\hat{M}_{K^0}=-6.1(4)\, \hbox {MeV}.\end{aligned}$$
(11)

The self-energy difference between the charged and neutral pion involves the same coefficient \(\epsilon _{m}\) that describes the mass difference in QCD—this is why the estimate for \( M_{\pi ^+}^\gamma -M_{\pi ^0}^\gamma \) is so sharp.
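The arithmetic leading from Eq. (9) to Eqs. (10) and (11) can be retraced with the following sketch. It inverts \(\Delta _P^\gamma = M_P^2-\hat{M}_P^2\) exactly for the central values and propagates the coefficient errors linearly, adding them in quadrature. The physical meson masses used as input are assumed PDG-style values that are not quoted in the text, and all names in the code are hypothetical.

```python
import math

# Physical masses in MeV: assumed PDG-style inputs, not quoted in the text.
M_PHYS = {"pi+": 139.57018, "pi0": 134.9766, "K+": 493.677, "K0": 497.614}

# (central value, error) of the coefficients in Eq. (9).
EPS = {"eps": (0.7, 0.3), "pi0": (0.07, 0.07), "K0": (0.3, 0.3), "m": (0.04, 0.02)}

# Delta_P^gamma / Delta_pi = const + sum_k c_k * eps_k, from Eqs. (6) and (7).
COEFFS = {
    "pi+": (1.0, {"pi0": 1.0, "m": -1.0}),
    "pi0": (0.0, {"pi0": 1.0}),
    "K+": (1.0, {"eps": 1.0, "K0": 1.0, "m": -1.0}),
    "K0": (0.0, {"K0": 1.0}),
}


def qcd_masses():
    """Return {meson: (M_hat, M_gamma, error)} with all entries in MeV."""
    delta_pi = M_PHYS["pi+"] ** 2 - M_PHYS["pi0"] ** 2
    result = {}
    for meson, (const, linear) in COEFFS.items():
        factor = const + sum(c * EPS[k][0] for k, c in linear.items())
        # Independent coefficient errors, added in quadrature.
        d_factor = math.sqrt(sum((c * EPS[k][1]) ** 2 for k, c in linear.items()))
        m_hat = math.sqrt(M_PHYS[meson] ** 2 - factor * delta_pi)
        # Linear propagation: d(M_hat) = d(Delta_P^gamma) / (2 M_hat).
        error = d_factor * delta_pi / (2.0 * m_hat)
        result[meson] = (m_hat, M_PHYS[meson] - m_hat, error)
    return result
```

With these inputs, the masses, self-energies and errors returned by `qcd_masses()` should agree with the entries of Eqs. (10) and (11) once rounded to one decimal place. The errors on the differences quoted there require the correlations between the coefficients to be tracked and are not reproduced by this simple version.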

3.2 Pion and kaon masses in the isospin limit

As mentioned above, most of the lattice calculations concerning the properties of the light mesons are performed in the isospin limit of QCD (\(m_{u}-m_{d}\rightarrow 0\) at fixed \(m_{u}+m_{d}\)). We denote the pion and kaon masses in that limit by \(\overline{M}_{\pi }\) and \(\overline{M}_{K}\), respectively. Their numerical values can be estimated as follows. Since the operation \(u\leftrightarrow d\) interchanges \(\pi ^+\) with \(\pi ^-\) and \(K^+\) with \(K^0\), the expansion of the quantities \(\hat{M}_{\pi ^+}^2\) and \(\frac{1}{2}(\hat{M}_{K^+}^2+\hat{M}_{K^0}^2)\) in powers of \(m_{u}-m_{d}\) only contains even powers. As shown in [58], the effects generated by \(m_{u}-m_{d}\) in the mass of the charged pion are strongly suppressed: the difference \(\hat{M}_{\pi ^+}^2-\overline{M}_{\pi }^{\,2}\) represents a quantity of \(O[(m_{u}-m_{d})^2(m_{u}+m_{d})]\) and is therefore small compared to the difference \(\hat{M}_{\pi ^+}^2-\hat{M}_{\pi ^0}^2\), for which an estimate was given above. In the case of \(\frac{1}{2}(\hat{M}_{K^+}^2+\hat{M}_{K^0}^2)-\overline{M}_{K}^{\,2}\), the expansion does contain a contribution at NLO, determined by the combination \(2L_8-L_5\) of low-energy constants, but the lattice results for that combination show that this contribution is very small, too. Numerically, the effects generated by \(m_{u}-m_{d}\) in \(\hat{M}_{\pi ^+}^2\) and in \(\frac{1}{2}(\hat{M}_{K^+}^2+\hat{M}_{K^0}^2)\) are negligible compared to the uncertainties in the electromagnetic self-energies. The estimates for these given in Eq. (11) thus imply

$$\begin{aligned}&\overline{M}_{\pi }= \hat{M}_{\pi ^+}=134.8(3)\,\mathrm{MeV},\nonumber \\&\overline{M}_{K}= \sqrt{\frac{1}{2}\left( \hat{M}_{K^+}^2+\hat{M}_{K^0}^2\right) }= 494.2(4)\,\mathrm{MeV}. \end{aligned}$$
(12)

This shows that, for the convention used above to specify the QCD sector of the Standard Model, and within the accuracy to which this convention can currently be implemented, the mass of the pion in the isospin limit agrees with the physical mass of the neutral pion: \(\overline{M}_{\pi }-M_{\pi ^0}=-0.2(3)\) MeV.
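The arithmetic behind (12) can be checked directly from the central values in (11). The following is only an illustrative sketch: the function name `isospin_limit_kaon_mass` is ours, and only central values are propagated, not the uncertainties.

```python
import math

# Central values of the QCD meson masses from Eq. (11), in MeV.
M_PI_PLUS_HAT = 134.8
M_K_PLUS_HAT = 491.2
M_K_ZERO_HAT = 497.2


def isospin_limit_kaon_mass(m_k_plus: float, m_k_zero: float) -> float:
    """Return the square root of the mean squared kaon mass, as in Eq. (12)."""
    return math.sqrt(0.5 * (m_k_plus**2 + m_k_zero**2))


m_k_bar = isospin_limit_kaon_mass(M_K_PLUS_HAT, M_K_ZERO_HAT)
print(f"M_pi (isospin limit) = {M_PI_PLUS_HAT:.1f} MeV")
print(f"M_K  (isospin limit) = {m_k_bar:.1f} MeV")
```

Averaging the squared masses rather than the masses themselves follows the form of (12); at this level of precision the two prescriptions are numerically indistinguishable.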

3.3 Lattice determination of \(m_{s}\) and \(m_{ud}\)

We now turn to a review of the lattice calculations of the light-quark masses and begin with \(m_{s}\), the isospin-averaged up- and down-quark mass, \(m_{ud}\), and their ratio. Most groups quote only \(m_{ud}\), not the individual up- and down-quark masses. We then discuss the ratio \(m_{u}/m_{d}\) and the individual determination of \(m_{u}\) and \(m_{d}\).

Quark masses have been calculated on the lattice since the mid-1990s. However, early calculations were performed in the quenched approximation, leading to unquantifiable systematics. Thus in the following, we only review modern, unquenched calculations, which include the effects of light sea-quarks.

Tables 2 and 3 list the results of \(N_\mathrm{f}=2\) and \(N_\mathrm{f}=2+1\) lattice calculations of \(m_{s}\) and \(m_{ud}\). These results are given in the \({\overline{\mathrm{MS}}}\) scheme at \(2\,\mathrm{GeV}\), which is standard nowadays, though some groups are starting to quote results at higher scales (e.g. [25]). The tables also show the colour-coding of the calculations leading to these results. The corresponding results for \(m_{s}/m_{ud}\) are given in Table 4. As indicated earlier in this review, we treat \(N_\mathrm{f}=2\) and \(N_\mathrm{f}=2+1\) calculations separately. The latter include the effects of a strange sea-quark, but the former do not.

Table 2 \(N_\mathrm{f}=2\) lattice results for the masses \(m_{ud}\) and \(m_{s}\) (MeV, running masses in the \({\overline{\mathrm{MS}}}\) scheme at scale 2 GeV). The significance of the colours is explained in Sect. 2. If information about non-perturbative running is available, this is indicated in the column “running”, with details given at the bottom of the table
Table 3 \(N_\mathrm{f}=2+1\) lattice results for the masses \(m_{ud}\) and \(m_{s}\) (see Table 2 for notation)
Table 4 Lattice results for the ratio \(m_{s}/m_{ud}\)

3.3.1 \(N_\mathrm{f}=2\) lattice calculations

We begin with \(N_\mathrm{f}=2\) calculations. A quick inspection of Table 2 indicates that only the most recent calculations, ALPHA 12 [59] and ETM 10B [60], control all systematic effects—the special case of Dürr 11 [61] is discussed below. Only ALPHA 12 [59], ETM 10B [60] and ETM 07 [62] really enter the chiral regime, with pion masses down to about 270 MeV for ALPHA and ETM. Because this pion mass is still quite far from the physical pion mass, ALPHA 12 refrain from determining \(m_{ud}\) and give only \(m_{s}\). All the other calculations have significantly more massive pions, the lightest being about 430 MeV, in the calculation by CP-PACS 01 [63]. Moreover, the latter calculation is performed on very coarse lattices, with lattice spacings \(a\ge 0.11\,\,{\mathrm {fm}}\) and only one-loop perturbation theory is used to renormalise the results.

ETM 10B’s [60] calculation of \(m_{ud}\) and \(m_{s}\) is an update of the earlier twisted-mass determination of ETM 07 [62]. In particular, they have added ensembles with a larger volume and three new lattice spacings, \(a = 0.054, 0.067\) and \(0.098\,\,{\mathrm {fm}}\), allowing for a continuum extrapolation. In addition, they present analyses performed in SU(2) and \(\hbox {SU}(3) \chi \)PT.

The new ALPHA 12 [59] calculation of \(m_{s}\) is an update of ALPHA 05 [64], which pushes computations to finer lattices and much lighter pion masses. It also importantly includes a determination of the lattice spacing with the decay constant \(F_K\), whereas ALPHA 05 converted results to physical units using the scale parameter \(r_0\) [65], defined via the force between static quarks. In particular, the conversion relied on measurements of \(r_0/a\) by QCDSF/UKQCD 04 [66] which differ significantly from the new determination by ALPHA 12. As in ALPHA 05, in ALPHA 12 both non-perturbative running and non-perturbative renormalisation are performed in a controlled fashion, using Schrödinger functional methods.

The conclusion of our analysis of \(N_\mathrm{f}=2\) calculations is that the results of ALPHA 12 [59] and ETM 10B [60] (which update and extend ALPHA 05 [64] and ETM 07 [62], respectively), are the only ones to date which satisfy our selection criteria. Thus we average those two results for \(m_{s}\), obtaining 101(3) MeV. Regarding \(m_{ud}\), for which only ETM 10B [60] gives a value, we do not offer an average but simply quote ETM’s number. Because ALPHA’s result induces an increase by 7 % of our earlier average for \(m_{s}\) [1], while \(m_{ud}\) remains unchanged, our average for \(m_{s}/m_{ud}\) also increases by 7 %. For the latter, however, we retain the percent error quoted by ETM, who directly estimates this ratio, and add it in quadrature to the percent error on ALPHA’s \(m_{s}\). Thus, we quote as our estimates:

$$\begin{aligned}&N_\mathrm{f}=2 : m_{s}= 101(3) \,\hbox {MeV},\ m_{ud}= 3.6(2) \,\hbox {MeV} ,\nonumber \\&\quad \frac{m_{s}}{m_{ud}} = 28.1(1.2).\end{aligned}$$
(13)

The errors on these results are 3, 6 and 4 %, respectively. The error is smaller in the ratio than one would get from combining the errors on \(m_{ud}\) and \(m_{s}\), because statistical and systematic errors cancel in ETM’s result for this ratio, most notably those associated with renormalisation and the setting of the scale. It is worth noting that thanks to ALPHA 12 [59], the total error on \(m_{s}\) has reduced significantly, from 7 % in the last edition of our report to 3 % now. It is also interesting to remark that ALPHA 12’s [59] central value for \(m_{s}\) is about 1 \(\sigma \) larger than that of ETM 10B [60] and nearly 2 \(\sigma \) larger than our present \(N_\mathrm{f}=2+1\) determination given in (14). Moreover, this larger value for \(m_{s}\) increases our \(N_\mathrm{f}=2\) determination of \(m_{s}/m_{ud}\), making it larger than ETM 10B’s direct measurement, though compatible within errors.

We have not discussed yet the precise results of Dürr 11 [61] which satisfy our selection criteria. This is because Dürr 11 pursue an approach which is sufficiently different from the one of other calculations that we prefer not to include it in an average at this stage. Following HPQCD 09A, 10 [72, 73], the observable which they actually compute is \(m_{c}/m_{s}=11.27(30)(26)\), with an accuracy of 3.5 %. This result is about 1.5 combined standard deviations below ETM 10B’s [60] result \(m_{c}/m_{s}=12.0(3)\). \(m_{s}\) is subsequently obtained using lattice and phenomenological determinations of \(m_{c}\) which rely on perturbation theory. The value of the charm-quark mass which they use is an average of those determinations, which they estimate to be \(m_{c}(2\,\mathrm{GeV})=1.093(13)\,\mathrm{GeV}\), with a 1.2 % total uncertainty. Note that this value is consistent with the PDG average \(m_{c}(2\,\mathrm{GeV})=1.094(21)\,\mathrm{GeV}\) [74], though the latter has a larger 2.0 % uncertainty. Dürr 11’s value of \(m_{c}\) leads to \(m_{s}=97.0(2.6)(2.5)\,\mathrm{MeV}\) given in Table 2, which has a total error of 3.7 %. The use of the PDG value for \(m_{c}\) [74] would lead to a very similar result. The result for \(m_{s}\) is perfectly compatible with our estimate given in (13) and has a comparable error bar. To determine \(m_{ud}\), Dürr 11 combine their result for \(m_{s}\) with the \(N_\mathrm{f}=2+1\) calculation of \(m_{s}/m_{ud}\) of BMW 10A, 10B [22, 23] discussed below. They obtain \(m_{ud}=3.52(10)(9)\,\mathrm{MeV}\) with a total uncertainty of less than 4 %, which is again fully consistent with our estimate of (13) and its uncertainty.
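The numbers quoted for Dürr 11 can be followed with a short calculation. The sketch below is illustrative only. It assumes linear error propagation for a quotient and, as a guess at the bookkeeping, that the uncertainty on \(m_{c}\) is added in quadrature to the systematic error of \(m_{c}/m_{s}\). The variable names are ours.

```python
import math

# Inputs quoted in the text for Dürr 11.
MC_OVER_MS = 11.27       # central value of m_c/m_s
MC_OVER_MS_STAT = 0.30   # first error on m_c/m_s
MC_OVER_MS_SYST = 0.26   # second error on m_c/m_s
MC_2GEV = 1093.0         # m_c(2 GeV) in MeV
MC_2GEV_ERR = 13.0       # total uncertainty on m_c(2 GeV) in MeV

m_s = MC_2GEV / MC_OVER_MS

# For a quotient, relative errors carry over to the result.
stat = m_s * MC_OVER_MS_STAT / MC_OVER_MS
# Assumption: the m_c error is folded into the systematic error in quadrature.
syst = m_s * math.hypot(MC_OVER_MS_SYST / MC_OVER_MS, MC_2GEV_ERR / MC_2GEV)
total_rel = math.hypot(stat, syst) / m_s

print(f"m_s = {m_s:.1f}({stat:.1f})({syst:.1f}) MeV, total {100 * total_rel:.1f} %")
```

Under these assumptions the rounded output agrees with the value \(m_{s}=97.0(2.6)(2.5)\,\mathrm{MeV}\) and the 3.7 % total error quoted above.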

3.3.2 \(N_\mathrm{f}=2+1\) lattice calculations

We turn now to \(N_\mathrm{f}=2+1\) calculations. These and the corresponding results are summarised in Tables 3 and 4. Somewhat paradoxically, these calculations are more mature than those with \(N_\mathrm{f}=2\). This is thanks, in large part, to the head start and sustained effort of MILC, who have been performing \(N_\mathrm{f}=2+1\) rooted staggered fermion calculations for the past ten or so years. They have covered an impressive range of parameter space, with lattice spacings which, today, go down to 0.045 fm and valence pion masses down to approximately 180 MeV [37]. The most recent updates, MILC 10A [75] and MILC 09A [37], include significantly more data and use two-loop renormalisation. Since these data sets subsume those of their previous calculations, these latest results are the only ones that must be kept in any world average.

Since our last report [1] the situation for \(N_\mathrm{f}=2+1\) determinations of light-quark masses has undergone some evolution. There are new computations by RBC/UKQCD 12 [25], PACS-CS 12 [76] and Laiho 11 [77]. Furthermore, the results of BMW 10A, 10B [22, 23] have been published and can now be included in our averages.

The RBC/UKQCD 12 [25] computation improves on the one of RBC/UKQCD 10A [78] in a number of ways. In particular it involves a new simulation performed at a rather coarse lattice spacing of 0.144 fm, but with unitary pion masses down to 171(1) MeV and valence pion masses down to 143(1) MeV in a volume of \((4.6\,\,{\mathrm {fm}})^3\), compared, respectively, to 290 MeV, 225 MeV and \((2.7\,\,{\mathrm {fm}})^3\) in RBC/UKQCD 10A. This provides them with a significantly better control over the extrapolation to physical \(M_\pi \) and to the infinite-volume limit. As before, they perform non-perturbative renormalisation and running in RI/SMOM schemes. The only weaker point of the calculation comes from the fact that two of their three lattice spacings are larger than 0.1 fm and correspond to different discretisations, while the finest is only 0.085 fm, making it difficult to convincingly claim full control over the continuum limit. This is mitigated by the fact that the scaling violations which they observe on their coarsest lattice are for many quantities small, around 5 %.

The Laiho 11 results [77] are based on MILC staggered ensembles at the lattice spacings 0.15, 0.09 and 0.06 fm, on which they propagate domain-wall quarks. Moreover, they work in volumes of up to \((4.8\,\,{\mathrm {fm}})^3\). These features give them full control over the continuum and infinite-volume extrapolations. Their lightest RMS sea pion mass is 280 MeV and their valence pions have masses down to 210 MeV. The fact that their sea pions do not enter deeply into the chiral regime penalises somewhat their extrapolation to physical \(M_\pi \). Moreover, to renormalise the quark masses, they use one-loop perturbation theory for \(Z_A/Z_S-1\) which they combine with \(Z_A\) determined non-perturbatively from the axial-vector Ward identity. Although they conservatively estimate the uncertainty associated with the procedure to be 5 %, which is the size of their largest one-loop correction, this represents a weaker point of this calculation.

The new PACS-CS 12 [76] calculation represents an important extension of the collaboration’s earlier 2010 computation [21], which already probed pion masses down to \(M_\pi \simeq 135\,\mathrm{MeV}\), i.e. down to the physical-mass point. This was achieved by reweighting the simulations performed in PACS-CS 08 [19] at \(M_\pi \simeq 160\,\mathrm{MeV}\). If adequately controlled, this procedure eliminates the need to extrapolate to the physical-mass point and, hence, the corresponding systematic error. The new calculation now applies similar reweighting techniques to include electromagnetic and \(m_{u}\ne m_{d}\) isospin-breaking effects directly at the physical pion mass. It technically adds to Blum 10 [32] and BMW’s preliminary results of [43, 44] by including these effects not only for valence but also for sea-quarks, as is also done in [86]. Further, as in PACS-CS 10 [21], renormalisation of quark masses is implemented non-perturbatively, through the Schrödinger functional method [87]. As it stands, the main drawback of the calculation, which makes the inclusion of its results in a world average of lattice results inappropriate at this stage, is that for the lightest quark mass the volume is very small, corresponding to \(LM_\pi \simeq 2.0\), a value for which finite-volume effects will be difficult to control. Another problem is that the calculation was performed at a single lattice spacing, forbidding a continuum extrapolation. Further, it is unclear at this point what might be the systematic errors associated with the reweighting procedure.

As shown by the colour-coding in Tables 3 and 4, the BMW 10A, 10B [22, 23] calculation is still the only one to have addressed all sources of systematic effects while reaching the physical up- and down-quark mass by interpolation instead of by extrapolation. Moreover, their calculation was performed at five lattice spacings ranging from 0.054 to 0.116 fm, with full non-perturbative renormalisation and running and in volumes of up to (6 fm)\(^3\) guaranteeing that the continuum limit, renormalisation and infinite-volume extrapolation are controlled. It does neglect, however, isospin-breaking effects, which are small on the scale of their error bars.

Finally we come to another calculation which satisfies our selection criteria, HPQCD 10 [73] (which updates HPQCD 09A [72]). The strange-quark mass is computed using a precise determination of the charm-quark mass, \(m_{c}(m_{c})=1.273(6)\) GeV [73, 85], whose accuracy is better than 0.5 %, and a calculation of the quark-mass ratio \(m_{c}/m_{s}=11.85(16)\) [72], which achieves a precision slightly above 1 %. The determination of \(m_{s}\) via the ratio \(m_{c}/m_{s}\) displaces the problem of lattice renormalisation in the computation of \(m_{s}\) to one of renormalisation in the continuum for the determination of \(m_{c}\). To calculate \(m_{ud}\) HPQCD 10 [73] use the MILC 09 determination of the quark-mass ratio \(m_{s}/m_{ud}\) [15].

The high precision quoted by HPQCD 10 on the strange-quark mass relies in large part on the precision reached in the determination of the charm-quark mass [73, 85]. This calculation uses an approach based on the lattice determination of moments of charm-quark pseudoscalar, vector and axial-vector correlators. These moments are then combined with four-loop results from continuum perturbation theory to obtain a determination of the charm-quark mass in the \({\overline{\mathrm{MS}}}\) scheme. In the preferred case, in which pseudoscalar correlators are used for the analysis, there are no lattice renormalisation factors required, since the corresponding axial-vector current is partially conserved in the staggered lattice formalism.

Instead of combining the result for \(m_{c}/m_{s}\) of [72] with \(m_{c}\) from [73], one can use it with the PDG [74] average \(m_{c}(m_{c})=1.275(25)\,\mathrm{GeV}\), whose error is four times as large as the one obtained by HPQCD 10. If one does so, one obtains \(m_{s}=92.3(2.2)\,\mathrm{MeV}\) in lieu of the value \(m_{s}=92.2(1.3)\,\mathrm{MeV}\) given in Table 3, thereby nearly doubling HPQCD 10’s error. Though we plan to do so in the future, we have not yet performed a review of lattice determinations of \(m_{c}\). Thus, as for the results of Dürr 11 [61] in the \(N_\mathrm{f}=2\) case, we postpone its inclusion in our final averages until we have performed an independent analysis of \(m_{c}\), emphasizing that this novel strategy for computing the light-quark masses may very well turn out to be the best way to determine them.

This discussion leaves us with three results for our final average for \(m_{s}\), those of MILC 09A [37], BMW 10A, 10B [22, 23] and RBC/UKQCD 12 [25], and the result of HPQCD 10 [73] as an important cross-check. Thus, we first check that the three other results which will enter our final average are consistent with HPQCD 10’s result. To do this we implement the averaging procedure described in Sect. 2.2 on all four results. This yields \(m_{s}=93.0(1.0)\,\mathrm{MeV}\) with a \(\chi ^2/\hbox {dof} = 3.0/3=1.0\), indicating overall consistency. Note that in making this average, we have accounted for correlations in the small statistical errors of HPQCD 10 and MILC 09A. Omitting HPQCD 10 in our final average results in an increase by 50 % of the average’s uncertainty and by 0.8 \(\sigma \) of its central value. Thus, we obtain \(m_{s}=93.8(1.5)\,\mathrm{MeV}\) with a \(\chi ^2/\hbox {dof} = 2.26/2=1.13\). When repeating the exercise for \(m_{ud}\), we replace MILC 09A by the more recent analysis reported in MILC 10A [75]. A fit of all four results yields \(m_{ud}=3.41(5)\,\mathrm{MeV}\) with a \(\chi ^2/\hbox {dof} = 2.6/3=0.9\) and including only the same three as above gives \(m_{ud}=3.42(6)\,\mathrm{MeV}\) with a \(\chi ^2/\hbox {dof} = 2.4/2=1.2\). Here the results are barely distinguishable, indicating full compatibility of all four results. Note that the outcome of the averaging procedure amounts to a determination of \(m_{s}\) and \(m_{ud}\) of 1.6 and 1.8 %, respectively.

The heavy sea-quarks affect the determination of the light-quark masses only through contributions of order \(1/m_{c}^2\), which moreover are suppressed by the Okubo–Zweig–Iizuka rule. We expect these contributions to be small. However, note that the effect of omitted sea quarks on a given quantity is not uniquely defined: the size of the effect depends on how the theories with and without these flavours are matched. One way to set conventions is to ensure that the bare parameters common to both theories are fixed by the same physical observables and that the renormalisations are performed in the same scheme and at the same scale, with the appropriate numbers of flavours.

An upper bound on the heavy-quark contributions can be obtained by looking at the presumably much larger effect associated with omitting the strange quark in the sea. Within errors, the average value \(m_{ud} = 3.42(6)\) MeV obtained above from the data with \(N_\mathrm{f} = 2+1\) agrees with the result \(m_{ud} = 3.6(2)\) MeV for \(N_\mathrm{f} = 2\) quoted in (13): assuming that the underlying calculations more or less follow the above matching prescription, the effects generated by the quenching of the strange quark in \(m_{ud}\) are within the noise. Interpreting the two results as Gaussian distributions, the probability distribution of the difference \(\Delta m_{ud} \equiv (m_{ud}|_{N_\mathrm{f}=2})- (m_{ud}|_{N_\mathrm{f}=3})\) is also Gaussian, with \(\Delta m_{ud}=0.18(21)\) MeV. The corresponding root-mean-square \(\langle \Delta m_{ud}^2\rangle ^\frac{1}{2}= 0.28\) MeV provides an upper bound for the size of the effects due to strange quark quenching; it amounts to 8 % of \(m_{ud}\). In the case of \(m_{s}\), the analogous calculation yields \(\langle \Delta m_{s}^2\rangle ^\frac{1}{2}=7.9\) MeV and thus also amounts to an upper bound of about 8 %. Taking any of these numbers as an upper bound on the omission of charm effects in the \(N_\mathrm{f}=2+1\) results is, we believe, a significant overestimate.
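The root-mean-square estimate used here follows from the fact that, for a Gaussian variable with mean \(\mu \) and width \(\sigma \), \(\langle x^2\rangle =\mu ^2+\sigma ^2\). The sketch below illustrates the arithmetic, treating the two inputs as independent. The helper `rms_difference` is a hypothetical name introduced for this example.

```python
import math


def rms_difference(a: float, da: float, b: float, db: float) -> tuple:
    """Return (mean, sigma, rms) of a - b for independent Gaussian a and b."""
    mean = a - b
    sigma = math.hypot(da, db)
    # For a Gaussian distribution, <x^2> = mean^2 + sigma^2.
    rms = math.hypot(mean, sigma)
    return mean, sigma, rms


# m_ud: N_f = 2 value from (13) versus the N_f = 2+1 lattice average, in MeV.
d_ud, s_ud, rms_ud = rms_difference(3.6, 0.2, 3.42, 0.06)
# m_s: the same comparison, in MeV.
d_s, s_s, rms_s = rms_difference(101.0, 3.0, 93.8, 1.5)

print(f"m_ud: {d_ud:.2f}({s_ud:.2f}) MeV, rms {rms_ud:.2f} MeV, {100 * rms_ud / 3.42:.0f} %")
print(f"m_s : {d_s:.1f}({s_s:.1f}) MeV, rms {rms_s:.1f} MeV, {100 * rms_s / 93.8:.0f} %")
```

The percentages are taken relative to the \(N_\mathrm{f}=2+1\) central values; normalising to the \(N_\mathrm{f}=2\) values instead would change them only marginally.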

An underestimate of the upper bound on the sea-charm contributions to \(m_{s}\) can be obtained by transposing, to the \(s\bar{s}\) system, the perturbative, heavy quarkonium arguments put forward in [94] to determine the effect of sea charm on the \(\eta _{c}\) and \(J/\psi \) masses. An estimate using constituent quark masses [95] leads very roughly to a 0.05 % effect on \(m_{s}\), from which [95] concludes that the error on \(m_{s}\) and \(m_{ud}\) due to the omission of charm is of order 0.1 %.

One could also try to estimate the effect by analysing the relation between the parameters of QCD\(_3\) and those of full QCD in perturbation theory. The \(\beta \)- and \(\gamma \)-functions, which control the renormalisation of the coupling constants and quark masses, respectively, are known to four loops [83, 84, 96, 97]. The precision achieved in this framework for the decoupling of the \(t\)- and \(b\)-quarks is excellent, but the \(c\)-quark is not heavy enough: at the percent level, we believe that the corrections of order \(1/m_{c}^2\) cannot be neglected and the decoupling formulae of perturbation theory do not provide a reliable evaluation, because the scale \(m_{c}(m_{c})\simeq 1.28\,\mathrm{GeV}\) is too low for these formulae to be taken at face value. Consequently, the accuracy to which it is possible to identify the running masses of the light quarks of full QCD in terms of those occurring in QCD\(_3\) is limited. For this reason, it is preferable to characterise the masses \(m_{u}\), \(m_{d}\), \(m_{s}\) in terms of QCD\(_4\), where the connection with full QCD is under good control.

The role of the \(c\)-quarks in the determination of the light-quark masses will soon be studied in detail—some simulations with \(2+1+1\) dynamical quarks have already been carried out [24, 98]. For the moment, we choose to consider a crude, and hopefully reasonably conservative, upper bound on the size of the effects due to the neglected heavy quarks that can be established within the \(N_\mathrm{f}=2+1\) simulations themselves, without invoking perturbation theory. In [99] it is found that when the scale is set by \(M_\Xi \), the result for \(M_\Lambda \) agrees well with experiment within the 2.3 % accuracy of the calculation. Because of the very strong correlations between the statistical and systematic errors of these two masses, we expect the uncertainty in the difference \(M_\Xi -M_\Lambda \) to also be of order 2 %. To leading order in the chiral expansion this mass difference is proportional to \(m_{s}-m_{ud}\). Barring accidental cancellations, we conclude that the agreement of \(N_\mathrm{f}= 2+1\) calculations with experiment suggests an upper bound on the sensitivity of \(m_{s}\) to heavy sea-quarks of order 2 %.

Taking this uncertainty into account yields the following averages:

$$\begin{aligned} N_\mathrm{f}=2+1 :\quad m_{ud} = 3.42(6)(7)\,\mathrm{MeV},\quad m_{s}= 93.8(1.5)(1.9)\,\mathrm{MeV}, \end{aligned}$$
(14)

where the first error comes from the averaging of the lattice results, and the second is the one that we add to account for the neglect of sea effects from the charm and more massive quarks. This corresponds to determinations of \(m_{ud}\) and \(m_{s}\) with a precision of 2.6 and 2.7 %, respectively. These estimates represent the conclusions we draw from the information gathered on the lattice until now. They are shown as vertical bands in Figs. 1 and 2, together with the \(N_\mathrm{f}=2\) results (13).
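The second error in (14) can be traced back to the 2 % bound on heavy sea-quark effects argued for above. The following sketch simply applies that relative bound to the central values of the lattice averages; the dictionary layout and names are ours.

```python
# Relative bound on heavy sea-quark effects argued for in the text.
HEAVY_SEA_BOUND = 0.02

# Lattice averages: (central value, error from the averaging), in MeV.
averages = {"m_ud": (3.42, 0.06), "m_s": (93.8, 1.5)}

# Second error: the relative bound applied to each central value.
sea_errors = {name: HEAVY_SEA_BOUND * central for name, (central, _) in averages.items()}

for name, (central, lattice_err) in averages.items():
    print(f"{name} = {central} +- {lattice_err} (lattice) +- {sea_errors[name]:.2f} (heavy sea) MeV")
```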

Fig. 1

Mass of the strange quark (\({\overline{\mathrm{MS}}}\) scheme, running scale 2 GeV). The central and top panels show the lattice results listed in Tables 2 and 3. For comparison, the bottom panel collects a few sum rule results and also indicates the current PDG estimate. Diamonds represent results based on perturbative renormalisation, while squares indicate that, in the relation between the lattice regularised and renormalised \({\overline{\mathrm{MS}}}\) masses, non-perturbative effects are accounted for. The black squares and the grey bands represent our estimates (13) and (14). The significance of the colours is explained in Sect. 2

Fig. 2

Mean mass of the two lightest quarks, \(m_{ud}=\frac{1}{2}(m_{u}+m_{d})\) (for details see Fig. 1)

In the ratio \(m_{s}/m_{ud}\), one of the sources of systematic error—the uncertainties in the renormalisation factors—drops out. Also, we can compare the lattice results with the leading-order formula of \(\chi \)PT,

$$\begin{aligned} \frac{m_{s}}{m_{ud}}\mathop {=}\limits ^{\mathrm{LO}}\frac{\hat{M}_{K^+}^2+ \hat{M}_{K^0}^2-\hat{M}_{\pi ^+}^2}{\hat{M}_{\pi ^+}^2},\end{aligned}$$
(15)

which relates the quantity \(m_{s}/m_{ud}\) to a ratio of meson masses in QCD. Expressing these in terms of the physical masses and the four coefficients introduced in (6)–(8), linearizing the result with respect to the corrections and inserting the observed mass values, we obtain

$$\begin{aligned} \frac{m_{s}}{m_{ud}} \mathop {=}\limits ^{\mathrm{LO}}25.9 - 0.1\, \epsilon + 1.9\, \epsilon _{\pi ^0} - 0.1\, \epsilon _{K^0} -1.8 \,\epsilon _{m}.\end{aligned}$$
(16)

If the coefficients \(\epsilon \), \(\epsilon _{\pi ^0}\), \(\epsilon _{K^0}\) and \(\epsilon _{m}\) are set equal to zero, the right hand side reduces to the value \(m_{s}/m_{ud}=25.9\) that follows from Weinberg’s leading-order formulae for \(m_{u}/m_{d}\) and \(m_{s}/m_{d}\) [100], in accordance with the fact that these do account for the e.m. interaction at leading chiral order, and neglect the mass difference between the charged and neutral pions in QCD. Inserting the estimates (9) gives the effect of chiral corrections to the e.m. self-energies and of the mass difference between the charged and neutral pions in QCD. With these, the LO prediction in QCD becomes

$$\begin{aligned} \frac{m_{s}}{m_{ud}}\mathop {=}\limits ^{\mathrm{LO}}25.9(1).\end{aligned}$$
(17)

The quoted uncertainty does not include an estimate of the higher-order contributions; it only accounts for the error bars in the coefficients and is dominated by the one in the estimate given for \(\epsilon _{\pi ^0}\). The fact that the central value remains unchanged indicates that chiral corrections to the e.m. self-energies and mass-difference corrections are small in this particular quantity. However, given the high accuracy reached in lattice determinations of the ratio \(m_{s}/m_{ud}\), the uncertainties associated with e.m. corrections are no longer completely irrelevant. This is seen by comparing the 0.1 in (17) with the 0.15 in (18). Nevertheless, this uncertainty is still smaller than our \(\sim 1\)–1.5 % upper bound on possible \(1/m_{c}^2\) corrections (Fig. 3).
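As a numerical cross-check, the leading-order formula (15) can be evaluated with the central values of the QCD meson masses given in (11). This is a minimal sketch in which the function name `lo_ms_over_mud` is ours and no uncertainties are propagated.

```python
# Central values of the QCD meson masses from Eq. (11), in MeV.
M_PI_PLUS_HAT = 134.8
M_K_PLUS_HAT = 491.2
M_K_ZERO_HAT = 497.2


def lo_ms_over_mud(m_pi: float, m_k_plus: float, m_k_zero: float) -> float:
    """Leading-order chiral formula (15) for m_s/m_ud."""
    return (m_k_plus**2 + m_k_zero**2 - m_pi**2) / m_pi**2


ratio_lo = lo_ms_over_mud(M_PI_PLUS_HAT, M_K_PLUS_HAT, M_K_ZERO_HAT)
print(f"m_s/m_ud (LO) = {ratio_lo:.1f}")
```

To the precision shown, the result agrees with the central value in (17).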

Fig. 3

Results for the ratio \(m_{s}/m_{ud}\). The upper part indicates the lattice results listed in Table 4. The lower part shows results obtained from \(\chi \)PT and sum rules, together with the current PDG estimate

The lattice results in Table 4, which satisfy our selection criteria, indicate that the corrections generated by the nonleading terms of the chiral perturbation series are remarkably small, in the range 3–10 %. Despite the fact that the SU(3)-flavour-symmetry-breaking effects in the Nambu–Goldstone boson masses are very large (\(M_K^2\simeq 13\, M_\pi ^2\)), the mass spectrum of the pseudoscalar octet obeys the SU(3) \(\times \) SU(3) formula (15) very well.

Our average for \(m_{s}/m_{ud}\) is based on the results of MILC 09A, BMW 10A, 10B and RBC/UKQCD 12; the value quoted by HPQCD 10 does not represent independent information as it relies on the result for \(m_{s}/m_{ud}\) obtained by the MILC collaboration. Averaging these results according to the prescription of Sect. 2.3 gives \(m_{s}/m_{ud}=27.46(15)\) with \(\chi ^2/\hbox {dof}=0.2/2\). The fit is dominated by MILC 09A and BMW 10A, 10B. Since the errors associated with renormalisation drop out in the ratio, the uncertainties are even smaller than in the case of the quark masses themselves: the above number for \(m_{s}/m_{ud}\) amounts to an accuracy of 0.5 %.

At this level of precision, the uncertainties in the electromagnetic and strong isospin-breaking corrections are not completely negligible. The error estimate in the LO result (17) indicates the expected order of magnitude. The uncertainties in \(m_{s}\) and \(m_{ud}\) associated with the heavy sea-quarks cancel at least partly. In view of this, we ascribe a total 1.5 % uncertainty to these two sources of error. Thus, we are convinced that our final estimate,

$$\begin{aligned} N_\mathrm{f}=2+1 :\quad \frac{m_{s}}{m_{ud}}=27.46(15)(41),\end{aligned}$$
(18)

is on the conservative side, with a total 1.6 % uncertainty. It is also fully consistent with the ratio computed from our individual quark masses in (14), \(m_{s}/m_{ud}=27.6(6)\), which has a larger 2.2 % uncertainty. In (18) the first error comes from the averaging of the lattice results, and the second is the one that we add to account for the neglect of isospin-breaking and heavy sea-quark effects.

The lattice results show that the LO prediction of \(\chi \)PT in (17) receives only small corrections from higher orders of the chiral expansion: according to (18), these generate a shift of \(5.7\pm 1.5\, \%\). Our estimate does therefore not represent a very sharp determination of the higher-order contributions.
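The size of the shift can be recovered from the central values of (17) and (18). The sketch below assumes that the shift is normalised to the full lattice value, a choice which reproduces the quoted central value; normalising to the LO value instead would give a slightly larger number.

```python
RATIO_LATTICE = 27.46  # central value of (18)
RATIO_LO = 25.9        # central value of (17)

# Assumption: the shift is expressed as a fraction of the full lattice result.
shift = (RATIO_LATTICE - RATIO_LO) / RATIO_LATTICE
print(f"higher-order shift = {100 * shift:.1f} %")
```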

The ratio \(m_{s}/m_{ud}\) can also be extracted from the masses of the neutral Nambu–Goldstone bosons: neglecting effects of order \((m_{u}-m_{d})^2\) also here, the leading-order formula reads \(m_{s}/m_{ud}\mathop {=}\limits ^{\mathrm{LO}}\frac{3}{2}\hat{M}_\eta ^2/\hat{M}_\pi ^2-\frac{1}{2}\). Numerically, this gives \(m_{s}/m_{ud}\mathop {=}\limits ^{\mathrm{LO}}24.2\). The relation has the advantage that the e.m. corrections are expected to be much smaller here, but it is more difficult to calculate the \(\eta \)-mass on the lattice. The comparison with (18) shows that, in this case, the contributions of NLO are somewhat larger: \(14\pm 2\) %.

3.4 Lattice determination of \(m_{u}\) and \(m_{d}\)

The determination of \(m_{u}\) and \(m_{d}\) separately requires additional input. MILC 09A [37] uses the mass difference between \(K^0\) and \(K^+\), from which they subtract electromagnetic effects using Dashen’s theorem with corrections, as discussed in Sect. 3.1. The up and down sea-quarks remain degenerate in their calculation, fixed to the value of \(m_{ud}\) obtained from \(M_{\pi ^0}\).

To determine \(m_{u}/m_{d}\), BMW 10A, 10B [22, 23] follow a slightly different strategy. They obtain this ratio from their result for \(m_{s}/m_{ud}\) combined with a phenomenological determination of the isospin-breaking quark-mass ratio \(Q=22.3(8)\), defined below in (24), from \(\eta \rightarrow 3\pi \) decays [30] (the decay \(\eta \rightarrow 3\pi \) is very sensitive to QCD isospin-breaking but fairly insensitive to QED isospin-breaking). As discussed in Sect. 3.5, the central value of the e.m. parameter \(\epsilon \) in (9) is taken from the same source.

RM123 11 [105] actually uses the e.m. parameter \(\epsilon =0.7(5)\) from the first edition of the FLAG review [1]. However, they estimate the effects of strong isospin-breaking at first non-trivial order, by inserting the operator \(\frac{1}{2}(m_{u}-m_{d})\int (\bar{u}u-\bar{d}d)\) into correlation functions, while performing the gauge averages in the isospin limit. Applying these techniques, they obtain \((\hat{M}_{K^0}^2-\hat{M}_{K^+}^2)/(m_{d}-m_{u})=2.57(8)\,\mathrm{GeV}\). Combining this result with the phenomenological \((\hat{M}_{K^0}^2-\hat{M}_{K^+}^2)=6.05(63)\times 10^3\,\mathrm{MeV}^2\) determined with the above value of \(\epsilon \), they get \((m_{d}-m_{u})=2.35(8)(24)\,\mathrm{MeV}\), where the first error corresponds to the lattice statistical and systematic uncertainties combined in quadrature, while the second arises from the uncertainty on \(\epsilon \). Note that below we quote results from RM123 11 for \(m_{u}\), \(m_{d}\) and \(m_{u}/m_{d}\). As described in Table 5, we obtain them by combining RM123 11’s result for \((m_{d}-m_{u})\) with ETM 10B’s result for \(m_{ud}\).
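The extraction of \(m_{d}-m_{u}\) amounts to dividing the phenomenological squared-mass splitting by the lattice slope. In the sketch below the slope is taken to be \(2.57(8)\times 10^3\) MeV, which is what dimensional consistency requires when the splitting is expressed in MeV\(^2\) and the quark-mass difference in MeV. The variable names are ours, and simple linear error propagation is assumed.

```python
# Lattice slope (M_K0^2 - M_K+^2)/(m_d - m_u), in MeV.
SLOPE, SLOPE_ERR = 2.57e3, 0.08e3
# Phenomenological QCD splitting M_K0^2 - M_K+^2 in MeV^2, for epsilon = 0.7(5).
SPLIT, SPLIT_ERR = 6.05e3, 0.63e3

delta_m = SPLIT / SLOPE                    # m_d - m_u in MeV
err_lattice = delta_m * SLOPE_ERR / SLOPE  # propagated from the lattice slope
err_epsilon = delta_m * SPLIT_ERR / SPLIT  # propagated from the error on epsilon

print(f"m_d - m_u = {delta_m:.2f} MeV")
print(f"lattice error ~ {err_lattice:.2f} MeV, epsilon error ~ {err_epsilon:.2f} MeV")
```

With the rounded inputs quoted in the text, the central value is reproduced, while the two errors agree with the quoted ones only to within about 0.01 MeV.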

Table 5 Lattice results for \(m_{u}\), \(m_{d}\) (MeV) and for the ratio \(m_{u}/m_{d}\). The values refer to the \({\overline{\mathrm{MS}}}\) scheme at scale 2 GeV. The upper part of the table lists results obtained with \(N_\mathrm{f}=2+1\), while the lower part presents calculations with \(N_\mathrm{f} = 2\)

Instead of subtracting electromagnetic effects using phenomenology, RBC 07 [34] and Blum 10 [32] actually include a quenched electromagnetic field in their calculation. This means that their results include corrections to Dashen’s theorem, albeit only in the presence of quenched electromagnetism. Since the up- and down-quarks in the sea are treated as degenerate, very small isospin corrections are neglected, as in MILC’s calculation.

PACS-CS 12 [76] takes the inclusion of isospin-breaking effects one step further. Using reweighting techniques, it also includes electromagnetic and \(m_{u}-m_{d}\) effects in the sea.

Lattice results for \(m_{u}\), \(m_{d}\) and \(m_{u}/m_{d}\) are summarised in Table 5. In order to discuss them, we consider the LO formula

$$\begin{aligned} \frac{m_{u}}{m_{d}}\mathop {=}\limits ^{\mathrm{LO}}\frac{\hat{M}_{K^+}^2-\hat{M}_{K^0}^2+\hat{M}_{\pi ^+}^2}{\hat{M}_{K^0}^2-\hat{M}_{K^+}^2+\hat{M}_{\pi ^+}^2} .\end{aligned}$$
(19)

Using Eqs. (6)–(8) to express the meson masses in QCD in terms of the physical ones and linearizing in the corrections, this relation takes the form

$$\begin{aligned} \frac{m_{u}}{m_{d}}\mathop {=}\limits ^{\mathrm{LO}}0.558 - 0.084\, \epsilon - 0.02\, \epsilon _{\pi ^0} + 0.11\, \epsilon _{m} .\end{aligned}$$
(20)

Inserting the estimates (9) and adding errors in quadrature, the LO prediction becomes

$$\begin{aligned} \frac{m_{u}}{m_{d}}\mathop {=}\limits ^{\mathrm{LO}}0.50(3).\end{aligned}$$
(21)

Again, the quoted error exclusively accounts for the errors attached to the estimates (9) for the epsilons—contributions of non-leading order are ignored. The uncertainty in the leading-order prediction is dominated by the one in the coefficient \(\epsilon \), which specifies the difference between the meson squared-mass splittings generated by the e.m. interaction in the kaon and pion multiplets. The reduction in the error on this coefficient since the previous review [1] results in a reduction of a factor of a little less than 2 in the uncertainty on the LO value of \(m_{u}/m_{d}\) given in (21).

It is interesting to compare the assumptions made or results obtained by the different collaborations for the violation of Dashen’s theorem. The input used in MILC 09A is \(\epsilon =1.2(5)\) [37], while the \(N_\mathrm{f}=2\) computation of RM123 13 finds \(\epsilon =0.79(18)(18)\) [45]. As discussed in Sect. 3.5, the value of \(Q\) used by BMW 10A, 10B [22, 23] gives \(\epsilon =0.70(28)\) at NLO (see (31)). On the other hand, RBC 07 [34] and Blum 10 [32] obtain the results \(\epsilon =0.13(4)\) and \(\epsilon =0.5(1)\), respectively. Note that PACS-CS 12 [76] does not provide results which allow us to determine \(\epsilon \) directly. However, using their result for \(m_{u}/m_{d}\), together with (20), and neglecting NLO terms, one finds \(\epsilon =-1.6(6)\), which is difficult to reconcile with what is known from phenomenology (see Sects. 3.1 and 3.5). Since the values assumed or obtained for \(\epsilon \) differ, it does not come as a surprise that the determinations of \(m_{u}/m_{d}\) are different.

These values of \(\epsilon \) are also interesting because they allow us to estimate the chiral corrections to the LO prediction (21) for \(m_{u}/m_{d}\). Indeed, evaluating the relation (20) for the values of \(\epsilon \) given above, and neglecting all other corrections in this equation, yields the LO values \((m_{u}/m_{d})^{\mathrm {LO}}=0.46(4)\), 0.547(3), 0.52(1), 0.50(2), 0.49(2) for MILC 09A, RBC 07, Blum 10, BMW 10A, 10B and RM123 13, respectively. However, in comparing these numbers to the non-perturbative results of Table 5 one must be careful not to double count the uncertainty arising from \(\epsilon \). One way to obtain a sharp comparison is to consider the ratio of the results of Table 5 to the LO values \((m_{u}/m_{d})^\mathrm{LO}\), in which the uncertainty from \(\epsilon \) cancels to good accuracy. Here we will assume for simplicity that they cancel completely and will drop all uncertainties related to \(\epsilon \). For \(N_\mathrm{f} = 2\) we consider RM123 13 [45], which updates RM123 11 and has no red dots. Since the uncertainties common to \(\epsilon \) and \(m_{u}/m_{d}\) are not explicitly given in [45], we have to estimate them. For that we use the leading-order result for \(m_{u}/m_{d}\), computed with RM123 13’s value for \(\epsilon \). Its error bar is the contribution of the uncertainty on \(\epsilon \) to \((m_{u}/m_{d})^\mathrm{LO}\). To good approximation this contribution will be the same for the value of \(m_{u}/m_{d}\) computed in [45]. Thus, we subtract it in quadrature from RM123 13’s result in Table 5 and compute \((m_{u}/m_{d})/(m_{u}/m_{d})^\mathrm{LO}\), dropping uncertainties related to \(\epsilon \). We find \((m_{u}/m_{d})/(m_{u}/m_{d})^\mathrm{LO} = 1.02(6)\). This result suggests that chiral corrections in the case of \(N_\mathrm{f}=2\) are negligible. For the two most accurate \(N_\mathrm{f}=2+1\) calculations, those of MILC 09A and BMW 10A, 10B, this ratio of ratios is 0.94(2) and 0.90(1), respectively. 
Though these two numbers are not fully consistent within our rough estimate of the errors, they indicate that higher-order corrections to (21) are negative and about 8 % when \(N_\mathrm{f}=2+1\). In the following, we will take them to be \(-\)8(4) %. The fact that these corrections are seemingly larger than, and of opposite sign to, those in the \(N_\mathrm{f}=2\) case is not understood at this point. It could be an effect associated with the quenching of the strange quark. It could also be due to the fact that the RM123 13 calculation does not probe deeply enough into the chiral regime (it has \(M_\pi \gtrsim 270\,\mathrm{MeV}\)) to pick up on important chiral corrections. Of course, being less than a two-standard-deviation effect, it may be that there is no problem at all and that differences from the LO result are actually small.
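The LO values quoted in this paragraph follow from evaluating relation (20) with only the \(\epsilon \) term kept. A minimal Python sketch of that evaluation is given below. Combining the two RM123 13 errors on \(\epsilon \) in quadrature is an assumption made here, and the function and dictionary names are illustrative.

```python
import math


def lo_ratio(eps, eps_err):
    """LO m_u/m_d from Eq. (20), keeping only the epsilon term.

    Returns the central value and the error induced by eps_err alone.
    """
    return 0.558 - 0.084 * eps, 0.084 * eps_err


# Values of epsilon assumed or obtained by each collaboration, as quoted in the text.
inputs = {
    "MILC 09A": (1.2, 0.5),
    "RBC 07": (0.13, 0.04),
    "Blum 10": (0.5, 0.1),
    "BMW 10A, 10B": (0.70, 0.28),
    # Two errors on epsilon combined in quadrature (assumption).
    "RM123 13": (0.79, math.hypot(0.18, 0.18)),
}

lo_values = {name: lo_ratio(*eps) for name, eps in inputs.items()}
for name, (value, err) in lo_values.items():
    print(f"{name:13s} {value:.3f} +/- {err:.3f}")
```

Rounded to the precision used in the text, the output agrees with 0.46(4), 0.547(3), 0.52(1), 0.50(2) and 0.49(2).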

Given the exploratory nature of the RBC 07 calculation, its results do not allow us to draw solid conclusions about the e.m. contributions to \(m_{u}/m_{d}\) for \(N_\mathrm{f}=2\). As discussed in Sect. 3.3.2, the \(N_\mathrm{f}=2+1\) results of Blum 10 and PACS-CS 12 do not pass our selection criteria either. We therefore resort to the phenomenological estimates of the electromagnetic self-energies discussed in Sect. 3.1, which are validated by recent, preliminary lattice results.

Since RM123 13 [45] includes a lattice estimate of e.m. corrections, for the \(N_\mathrm{f}=2\) final results we simply quote the values of \(m_{u}\), \(m_{d}\) and \(m_{u}/m_{d}\) from RM123 13 given in Table 5:

$$\begin{aligned} N_\mathrm{f}&= 2: m_{u} =2.40(23)\,\mathrm{MeV},\quad m_{d} = 4.80(23) \,\mathrm{MeV},\nonumber \\&\frac{m_{u}}{m_{d}} = 0.50(4), \end{aligned}$$
(22)

with errors of roughly 10, 5 and 8 %, respectively. In these results, the errors are obtained by combining the lattice statistical and systematic errors in quadrature.

For \(N_\mathrm{f}=2+1\) there is to date no final, published computation of e.m. corrections. Thus, we take the LO estimate for \(m_{u}/m_{d}\) of (21) and use the \(-\)8(4) % obtained above as an estimate of the size of the corrections from higher orders in the chiral expansion. This gives \(m_{u}/m_{d}=0.46(3)\). The two individual masses can then be worked out from the estimate (14) for their mean. Therefore, for \(N_\mathrm{f}=2+1\) we obtain

$$\begin{aligned}&N_\mathrm{f}= 2+1: m_{u} =2.16(9)(7)\,\mathrm{MeV},\nonumber \\&\quad m_{d} = 4.68(14)(7) \,\mathrm{MeV},\frac{m_{u}}{m_{d}} = 0.46(2)(2).\end{aligned}$$
(23)

In these results, the first error represents the lattice statistical and systematic errors, combined in quadrature, while the second arises from the uncertainties associated with e.m. corrections of (9). The estimates in (23) have uncertainties of order 5, 3 and 7 %, respectively.
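The steps leading to (23) can be sketched in Python as follows: the LO ratio (21) is rescaled by the \(-\)8(4) % higher-order correction, and the individual masses then follow from the mean \(m_{ud}\). Since the estimate (14) is not reproduced in this section, the sketch uses the mean implied by (23) itself, \(\frac{1}{2}(2.16+4.68)\,\mathrm{MeV}\), as a stand-in. The function names and the quadrature combination of errors are assumptions of the sketch.

```python
import math


def corrected_ratio(lo, lo_err, corr, corr_err):
    """Apply a fractional higher-order correction to the LO ratio.

    Errors on the LO value and on the correction are combined in quadrature.
    """
    value = lo * (1.0 + corr)
    err = math.hypot(lo_err * (1.0 + corr), lo * corr_err)
    return value, err


def split_masses(m_ud, r):
    """Return (m_u, m_d) given the mean m_ud = (m_u + m_d)/2 and r = m_u/m_d."""
    m_d = 2.0 * m_ud / (1.0 + r)
    return r * m_d, m_d


ratio, ratio_err = corrected_ratio(0.50, 0.03, -0.08, 0.04)
m_ud = 0.5 * (2.16 + 4.68)  # MeV; mean implied by Eq. (23), stand-in for estimate (14)
m_u, m_d = split_masses(m_ud, ratio)
print(f"m_u/m_d = {ratio:.2f} +/- {ratio_err:.2f}")
print(f"m_u = {m_u:.2f} MeV, m_d = {m_d:.2f} MeV")
```

The central values reproduce \(m_{u}/m_{d}=0.46(3)\) and, to within rounding, the masses in (23). The split of the errors into lattice and e.m. parts is not attempted here.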

Naively propagating errors to the end, we obtain \((m_{u}/m_{d})_{N_\mathrm{f}=2}/(m_{u}/m_{d})_{N_\mathrm{f}=2+1}=1.09(10)\). If instead of (22) we use the results from RM123 11, modified by the e.m. corrections in (9), as was done in our previous review, we obtain \((m_{u}/m_{d})_{N_\mathrm{f}=2}/(m_{u}/m_{d})_{N_\mathrm{f}=2+1}=1.11(7)(1)\), confirming again the strong cancellation of e.m. uncertainties in the ratio. The \(N_\mathrm{f}=2\) and \(2+1\) results are compatible at the 1 to 1.5 \(\sigma \) level.

It is interesting to note that in the results above, the errors are no longer dominated by the uncertainties in the input used for the electromagnetic corrections, though these are still significant at the level of precision reached in the \(N_\mathrm{f}=2+1\) results. This is due to the reduction in the error on \(\epsilon \) discussed in Sect. 3.1. Nevertheless, the comparison of Eqs. (21) and (23) indicates that more than half of the difference between the prediction \(m_{u}/m_{d}=0.558\) obtained from Weinberg’s mass formulae [100] and the result for \(m_{u}/m_{d}\) obtained on the lattice stems from electromagnetism, the higher orders in the chiral perturbation generating a comparable correction.

In view of the fact that a massless up-quark would solve the strong CP-problem, many authors have considered this an attractive possibility, but the results presented above exclude this possibility: the value of \(m_{u}\) in (23) differs from zero by 20 standard deviations. We conclude that nature solves the strong CP-problem differently. This conclusion relies on lattice calculations of kaon masses and on the phenomenological estimates of the e.m. self-energies discussed in Sect. 3.1. The uncertainties therein currently represent the limiting factor in determinations of \(m_{u}\) and \(m_{d}\). As demonstrated in [32–34, 40–44, 50], lattice methods can be used to calculate the e.m. self-energies. Further progress on the determination of the light-quark masses hinges on an improved understanding of the e.m. effects.

3.5 Estimates for \(R\) and \(Q\)

The quark-mass ratios

$$\begin{aligned} R\equiv \frac{m_{s}-m_{ud}}{m_{d}-m_{u}} \quad \hbox {and}\quad Q^2\equiv \frac{m_{s}^2-m_{ud}^2}{m_{d}^2-m_{u}^2} \end{aligned}$$
(24)

compare SU(3)-breaking with isospin-breaking. The quantity \(Q\) is of particular interest because of a low-energy theorem [106], which relates it to a ratio of meson masses,

$$\begin{aligned}&Q^2_M\equiv \frac{\hat{M}_K^2}{\hat{M}_\pi ^2}\cdot \frac{\hat{M}_K^2-\hat{M}_\pi ^2}{\hat{M}_{K^0}^2-\hat{M}_{K^+}^2} ,\quad \hat{M}^2_\pi \equiv \frac{1}{2}\left( \hat{M}^2_{\pi ^+}+ \hat{M}^2_{\pi ^0}\right) ,\nonumber \\&\quad \hat{M}^2_K\equiv \frac{1}{2}\left( \hat{M}^2_{K^+}+\hat{M}^2_{K^0}\right) .\end{aligned}$$
(25)

Chiral symmetry implies that the expansion of \(Q_M^2\) in powers of the quark masses (i) starts with \(Q^2\) and (ii) does not receive any contributions at NLO:

$$\begin{aligned} Q_M\mathop {=}\limits ^{\mathrm{NLO}}Q .\end{aligned}$$
(26)

Inserting the estimates for the mass ratios \(m_{s}/m_{ud}\) and \(m_{u}/m_{d}\) given for \(N_\mathrm{f}=2\) in Eqs. (13) and (22), respectively, we obtain

$$\begin{aligned} R=40.7(3.7)(2.2),\quad Q=24.3(1.4)(0.6) , \end{aligned}$$
(27)

where the errors have been propagated naively and the e.m. uncertainty has been separated out, as discussed in the third paragraph after (21). Thus, the meaning of the errors is the same as in (23). These numbers agree within errors with those reported in [45] where values for \(m_{s}\) and \(m_{ud}\) are taken from ETM 10B [60].

For \(N_\mathrm{f}=2+1\), we use Eqs. (18) and (23) and obtain

$$\begin{aligned} R=35.8(1.9)(1.8),\quad Q=22.6(7)(6), \end{aligned}$$
(28)

where the meaning of the errors is the same as above. The \(N_\mathrm{f}=2\) and \(N_\mathrm{f}=2+1\) results are compatible within 2\(\sigma \), even taking the correlations between e.m. effects into account.
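The definitions (24) can be rewritten in terms of the two ratios \(S\equiv m_{s}/m_{ud}\) and \(r\equiv m_{u}/m_{d}\), which gives \(R=(S-1)(1+r)/[2(1-r)]\) and \(Q^2=R\,(S+1)/2\). The Python sketch below implements both forms so that they can be checked against each other. The values \(S=28.1\) and \(S=27.46\) used in the usage lines are assumed stand-ins for Eqs. (13) and (18), which are not reproduced in this section, and the symbols \(S\) and \(r\) are introduced here for illustration only.

```python
import math


def r_and_q_from_masses(m_u, m_d, m_s):
    """Evaluate the definitions of R and Q in Eq. (24) directly from quark masses."""
    m_ud = 0.5 * (m_u + m_d)
    R = (m_s - m_ud) / (m_d - m_u)
    Q = math.sqrt((m_s**2 - m_ud**2) / (m_d**2 - m_u**2))
    return R, Q


def r_and_q_from_ratios(S, r):
    """Same quantities from S = m_s/m_ud and r = m_u/m_d."""
    R = (S - 1.0) * (1.0 + r) / (2.0 * (1.0 - r))
    Q = math.sqrt(0.5 * R * (S + 1.0))
    return R, Q


# Usage with assumed inputs for S (see the lead-in) and r from Eqs. (22) and (23).
R_nf2, Q_nf2 = r_and_q_from_ratios(28.1, 0.50)
R_nf21, Q_nf21 = r_and_q_from_ratios(27.46, 0.46)
print(f"N_f = 2:   R = {R_nf2:.1f}, Q = {Q_nf2:.1f}")
print(f"N_f = 2+1: R = {R_nf21:.1f}, Q = {Q_nf21:.1f}")
```

With these inputs the central values of (27) and (28) are recovered. The errors are not propagated in this sketch.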

It is interesting to use these results to study the size of chiral corrections in the relations of \(R\) and \(Q\) to their expressions in terms of meson masses. To investigate this issue, we use \(\chi \)PT to express the quark-mass ratios in terms of the pion and kaon masses in QCD and then again use Eqs. (6)–(8) to relate the QCD masses to the physical ones. Linearizing in the corrections, this leads to

$$\begin{aligned}&R \mathop {=}\limits ^{\mathrm{LO}}R_M \!=\! 43.9 - 10.8\, \epsilon \!+\! 0.2\, \epsilon _{\pi ^0} \!-\! 0.2\, \epsilon _{K^0}\!+\! 10.7\, \epsilon _{m},\nonumber \\\end{aligned}$$
(29)
$$\begin{aligned}&Q \mathop {=}\limits ^{\mathrm{NLO}}Q_M \!=\! 24.3 \!-\! 3.0\, \epsilon \!+\! 0.9\, \epsilon _{\pi ^0} \!-\! 0.1\, \epsilon _{K^0} \!+\! 2.6 \,\epsilon _{m} .\qquad \end{aligned}$$
(30)

While the first relation only holds to LO of the chiral perturbation series, the second remains valid at NLO, on account of the low energy theorem mentioned above. The first terms on the right hand side represent the values of \(R\) and \(Q\) obtained with the Weinberg leading-order formulae for the quark-mass ratios [100]. Inserting the estimates (9), we find that the e.m. corrections lower the Weinberg values to \(R_M= 36.7(3.3)\) and \(Q_M= 22.3(9)\), respectively.
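The values of \(R_M\) and \(Q_M\) quoted here can be reproduced from the linearized relations (29) and (30) with a short Python sketch. The estimates (9) are not reproduced in this section, so the inputs \(\epsilon =0.7(3)\), \(\epsilon _{\pi ^0}=0.07(7)\), \(\epsilon _{K^0}=0.3(3)\) and \(\epsilon _{m}=0.04(2)\) are assumptions of the sketch. The \(\epsilon _{m}\) term of \(R_M\) is taken with a positive sign, which is the sign that reproduces the quoted \(R_M=36.7\). Function and variable names are illustrative.

```python
import math

# Assumed stand-ins for the estimates (9): (central value, error).
EPS = {"eps": (0.7, 0.3), "pi0": (0.07, 0.07), "K0": (0.3, 0.3), "m": (0.04, 0.02)}


def linear_estimate(constant, coeffs, eps=EPS):
    """Evaluate constant + sum_k c_k * eps_k, adding the errors in quadrature."""
    central = constant + sum(c * eps[k][0] for k, c in coeffs.items())
    err = math.sqrt(sum((c * eps[k][1]) ** 2 for k, c in coeffs.items()))
    return central, err


# Coefficients of Eq. (29), with +10.7 for eps_m (see lead-in), and of Eq. (30).
R_M = linear_estimate(43.9, {"eps": -10.8, "pi0": 0.2, "K0": -0.2, "m": 10.7})
Q_M = linear_estimate(24.3, {"eps": -3.0, "pi0": 0.9, "K0": -0.1, "m": 2.6})
print(f"R_M = {R_M[0]:.1f} +/- {R_M[1]:.1f}")
print(f"Q_M = {Q_M[0]:.1f} +/- {Q_M[1]:.1f}")
```

The central values agree with those in the text. The error on \(R_M\) comes out marginally smaller than the quoted 3.3, which is presumably an effect of the rounded coefficients.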

Comparison of \(R_M\) and \(Q_M\) with the full results quoted above gives a handle on higher-order terms in the chiral expansion. Indeed, the ratios \(R_M/R\) and \(Q_M/Q\) give NLO and NNLO (and higher) corrections to the relations \(R \mathop {=}\limits ^{\mathrm{LO}}R_M\) and \(Q\mathop {=}\limits ^{\mathrm{NLO}}Q_M\), respectively. The uncertainties due to the use of the e.m. corrections of (9) are highly correlated in the numerators and denominators of these ratios, and we make the simplifying assumption that they cancel in the ratio. Thus, for \(N_\mathrm{f}=2\) we evaluate (29) and (30) using \(\epsilon =0.79(18)(18)\) from RM123 13 [45] and the other corrections from (9), dropping all uncertainties. We divide them by the results for \(R\) and \(Q\) in (27), omitting the uncertainties due to e.m. We obtain \(R_M/R\simeq 0.88(8)\) and \(Q_M/Q\simeq 0.91(5)\). We proceed analogously for \(N_\mathrm{f}=2+1\), using \(\epsilon =0.7(3)\) from (9) and \(R\) and \(Q\) from (28), and find \(R_M/R\simeq 1.02(5)\) and \(Q_M/Q\simeq 0.99(3)\). The chiral corrections appear to be small for \(N_\mathrm{f}=2+1\), especially those in the relation of \(Q\) to \(Q_M\). This is less true for \(N_\mathrm{f}=2\), where the NNLO and higher corrections to \(Q=Q_M\) could be significant. However, as for other quantities which depend on \(m_{u}/m_{d}\), this difference is not significant.

As mentioned in Sect. 3.1, there is a phenomenological determination of \(Q\) based on the decay \(\eta \rightarrow 3\pi \) [107, 108]. The key point is that the transition \(\eta \rightarrow 3\pi \) violates isospin-conservation. The dominating contribution to the transition amplitude stems from the mass difference \(m_{u}-m_{d}\). At NLO of \(\chi \)PT, the QCD part of the amplitude can be expressed in a parameter-free manner in terms of \(Q\). It is well-known that the electromagnetic contributions to the transition amplitude are suppressed (a thorough recent analysis is given in [109]). This implies that the result for \(Q\) is less sensitive to the electromagnetic uncertainties than the value obtained from the masses of the Nambu–Goldstone bosons. For a recent update of this determination and for further references to the literature, we refer to [110]. Using dispersion theory to pin down the momentum-dependence of the amplitude, the observed decay rate implies \(Q=22.3(8)\) (since the uncertainty quoted in [110] does not include an estimate for all sources of error, we have retained the error estimate given in [104], which is twice as large). The formulae for the corrections of NNLO are available also in this case [111]—the poor knowledge of the effective coupling constants, particularly of those that are relevant for the dependence on the quark masses, is currently the limiting factor encountered in the application of these formulae.

As was to be expected, the central value of \(Q\) obtained from \(\eta \)-decay agrees exactly with the central value obtained from the low-energy theorem: we have used that theorem to estimate the coefficient \(\epsilon \), which dominates the e.m. corrections. Using the numbers for \(\epsilon _{m}\), \(\epsilon _{\pi ^0}\) and \(\epsilon _{K^0}\) in (9) and adding the corresponding uncertainties in quadrature to those in the phenomenological result for \(Q\), we obtain

$$\begin{aligned} \epsilon \mathop {=}\limits ^{\mathrm{NLO}}0.70(28).\end{aligned}$$
(31)

The estimate (9) for the size of the coefficient \(\epsilon \) is taken from here, as it is confirmed by the most recent, preliminary lattice determinations [40–45].
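The extraction of \(\epsilon \) in (31) amounts to solving the linear relation (30) for \(\epsilon \) with \(Q_M\) set equal to the phenomenological \(Q=22.3(8)\). A minimal Python sketch is shown below. The default values \(\epsilon _{\pi ^0}=0.07(7)\), \(\epsilon _{K^0}=0.3(3)\) and \(\epsilon _{m}=0.04(2)\) are assumed stand-ins for the estimates (9), which are not reproduced in this section, and the function name is illustrative.

```python
import math


def epsilon_from_q(q, q_err, eps_pi0=(0.07, 0.07), eps_k0=(0.3, 0.3), eps_m=(0.04, 0.02)):
    """Solve Eq. (30) for epsilon, given Q and the other e.m. parameters.

    Each parameter is a (central value, error) pair; errors add in quadrature.
    """
    central = (24.3 + 0.9 * eps_pi0[0] - 0.1 * eps_k0[0] + 2.6 * eps_m[0] - q) / 3.0
    err = math.sqrt(
        q_err**2
        + (0.9 * eps_pi0[1]) ** 2
        + (0.1 * eps_k0[1]) ** 2
        + (2.6 * eps_m[1]) ** 2
    ) / 3.0
    return central, err


eps, eps_err = epsilon_from_q(22.3, 0.8)
print(f"epsilon = {eps:.2f} +/- {eps_err:.2f}")
```

With the rounded coefficients of (30) the result lands within about 0.01 of the central value and of the error in (31). The uncertainty is dominated by the error on \(Q\).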

Our final results for the masses \(m_{u}\), \(m_{d}\), \(m_{ud}\), \(m_{s}\) and the mass ratios \(m_{u}/m_{d}\), \(m_{s}/m_{ud}\), \(R\), \(Q\) are collected in Tables 6 and 7. We separate \(m_{u}\), \(m_{d}\), \(m_{u}/m_{d}\), \(R\) and \(Q\) from \(m_{ud}\), \(m_{s}\) and \(m_{s}/m_{ud}\), because the latter are completely dominated by lattice results while the former still include some phenomenological input.

Table 6 Our estimates for the strange and the average up-down quark masses in the \({\overline{\mathrm{MS}}}\) scheme at running scale \(\mu =2\,\mathrm{GeV}\) for \(N_\mathrm{f}=3\). Numerical values are given in MeV. In the results presented here, the first error is the one which we obtain by applying the averaging procedure of Sect. 2.2 to the relevant lattice results. We have added an uncertainty to the \(N_\mathrm{f}=2+1\) results, which is associated with the neglect of heavy sea-quark and isospin-breaking effects, as discussed around (14) and (18). This uncertainty is not included in the \(N_\mathrm{f}=2\) results, as it should be smaller than the uncontrolled systematic associated with the neglect of strange sea-quark effects, which we choose not to estimate because this cannot be done reliably
Table 7 Our estimates for the masses of the two lightest quarks and related, strong isospin-breaking ratios. Again, the masses refer to the \({\overline{\mathrm{MS}}}\) scheme at running scale \(\mu =2\,\mathrm{GeV}\) for \(N_\mathrm{f}=3\) and the numerical values are given in MeV. In the results presented here, the first error is the one that comes from lattice computations while the second for \(N_\mathrm{f}=2+1\) is associated with the phenomenological estimate of e.m. contributions, as discussed after (23). The second error on the \(N_\mathrm{f}=2\) results for \(R\) and \(Q\) is also an estimate of the e.m. uncertainty, this time associated with the lattice computation of [45], as explained after (27). We present these results in a separate table, because they are less firmly established than those in Table 6. For \(N_\mathrm{f}=2+1\) they still include information coming from phenomenology, in particular on e.m. corrections, and for \(N_\mathrm{f}=2\) the e.m. contributions are computed neglecting the feedback of sea-quarks on the photon field

4 Leptonic and semileptonic kaon and pion decay and \(|V_{ud}|\) and \(|V_{us}|\)

This section summarises state-of-the-art lattice calculations of the leptonic kaon and pion decay constants and the kaon semileptonic decay form factor and provides an analysis in view of the Standard Model. With respect to the previous edition of the FLAG review [1] the data in this section have been updated, correlations of lattice data are now taken into account in all the analyses and a subsection on the individual decay constants \(f_K\) and \(f_\pi \) (rather than only the ratio) has been included. Furthermore, when combining lattice data with experimental results we now take into account the strong SU(2) isospin correction in chiral perturbation theory for the ratio of leptonic decay constants \(f_K/f_\pi \).

4.1 Experimental information concerning \(|V_{ud}|\), \(|V_{us}|\), \(f_+(0)\) and \( {f_{K^\pm }}/{f_{\pi ^\pm }}\)

The following review relies on the fact that precision experimental data on kaon decays very accurately determine the product \(|V_{us}|f_+(0)\) and the ratio \(|V_{us}/V_{ud}|f_{K^\pm }/f_{\pi ^\pm }\) [112]:

$$\begin{aligned} |V_{us}| f_+(0) = 0.2163(5),\quad \left| \frac{V_{us}}{V_{ud}}\right| \frac{ f_{K^\pm }}{ f_{\pi ^\pm }} =0.2758(5).\end{aligned}$$
(32)

Here and in the following \(f_{K^\pm }\) and \(f_{\pi ^\pm }\) are the isospin-broken decay constants, respectively, in QCD (the electromagnetic effects have already been subtracted in the experimental analysis using chiral perturbation theory). We will refer to the decay constants in the SU(2) isospin-symmetric limit as \(f_{K}\) and \(f_{\pi }\). \(|V_{ud}|\) and \(|V_{us}|\) are elements of the Cabibbo–Kobayashi–Maskawa matrix and \(f_+(t)\) represents one of the form factors relevant for the semileptonic decay \(K^0\rightarrow \pi ^-\ell \,\nu \), which depends on the momentum transfer \(t\) between the two mesons. What matters here is the value at \(t=0\): \(f_+(0)\equiv f_+^{K^0\pi ^-}(t)\big |_{t\rightarrow 0}\). The pion and kaon decay constants are defined by (see footnote 9)

$$\begin{aligned} {\langle }0|\,\bar{d}\gamma _\mu \gamma _5 u|\pi ^+(p)\rangle =i p_\mu f_{\pi ^+},\quad {\langle }0|\,\bar{s}\gamma _\mu \gamma _5 u|K^+(p)\rangle =i p_\mu f_{K^+}.\end{aligned}$$

In this normalisation, \(f_{\pi ^\pm } \simeq 130\) MeV, \(f_{K^\pm }\simeq 155\) MeV.

The measurement of \(|V_{ud}|\) based on superallowed nuclear \(\beta \) transitions has now become remarkably precise. The result of the update of Hardy and Towner [115], which is based on 20 different superallowed transitions, reads (see footnote 10)

$$\begin{aligned} |V_{ud}| = 0.97425(22).\end{aligned}$$
(33)

The matrix element \(|V_{us}|\) can be determined from semi-inclusive \(\tau \) decays [122–125]. Separating the inclusive decay \(\tau \rightarrow \hbox {hadrons}+\nu \) into non-strange and strange final states, e.g. HFAG 12 [126] obtain

$$\begin{aligned} |V_{us}|=0.2173(22) .\end{aligned}$$
(34)

Maltman et al. [124, 127, 128] and Gamiz et al. [129, 130] arrive at very similar values.

In principle, \(\tau \) decay offers a clean measurement of \(|V_{us}|\), but a number of open issues yet remain to be clarified. In particular, the value of \(|V_{us}|\) as determined from inclusive \(\tau \) decays differs from the result one obtains from assuming three-flavour SM-unitarity by more than three standard deviations [126]. It is important to understand this apparent tension better. The most interesting possibility is that \(\tau \) decay involves new physics, but more work both on the theoretical (see e.g. [131–134]) and experimental side is required.
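The size of this tension can be estimated from the numbers in (33) and (34) alone. The Python sketch below assumes first-row unitarity in the form \(|V_{ud}|^2+|V_{us}|^2=1\), that is, it neglects the small \(|V_{ub}|^2\) term, and it treats the two inputs as uncorrelated. Both are simplifying assumptions of the sketch rather than the procedure of [126], and the function names are illustrative.

```python
import math


def vus_from_unitarity(vud, vud_err):
    """|V_us| implied by |V_ud|^2 + |V_us|^2 = 1 (|V_ub| neglected)."""
    vus = math.sqrt(1.0 - vud**2)
    # Linear error propagation: d|V_us| = |V_ud| d|V_ud| / |V_us|.
    return vus, vud * vud_err / vus


def tension(a, a_err, b, b_err):
    """Difference of two uncorrelated values in units of the combined error."""
    return abs(a - b) / math.hypot(a_err, b_err)


vus_unit, vus_unit_err = vus_from_unitarity(0.97425, 0.00022)  # Eq. (33)
n_sigma = tension(vus_unit, vus_unit_err, 0.2173, 0.0022)      # Eq. (34)
print(f"|V_us| from unitarity = {vus_unit:.4f} +/- {vus_unit_err:.4f}")
print(f"tension with tau decays = {n_sigma:.1f} sigma")
```

Under these assumptions the unitarity value is about 0.2255 and the difference to the \(\tau \)-decay value exceeds three standard deviations, in line with the statement above.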

The experimental results in Eq. (32) are for the semileptonic decay of a neutral kaon into a negatively charged pion and the charged pion and kaon leptonic decays, respectively, in QCD. In the case of the semileptonic decays the corrections for strong and electromagnetic isospin breaking in chiral perturbation theory at NLO have allowed for averaging the different experimentally measured isospin channels [112]. This is quite a convenient procedure as long as lattice QCD does not include strong or QED isospin-breaking effects. Lattice results for \(f_K/f_\pi \) are typically quoted for QCD with (squared) pion and kaon masses of \(M_\pi ^2=M_{\pi ^0}^2\) and \(M_K^2=\frac{1}{2} (M_{K^\pm }^2+M_{K^0}^2-M_{\pi ^\pm }^2+M_{\pi ^0}^2)\) for which the leading strong and electromagnetic isospin violations cancel. While progress is being made for including strong and electromagnetic isospin breaking in the simulations (e.g. [19, 86, 105, 135–137]), for now contact with experimental results is made by correcting for leading SU(2) isospin breaking, guided by chiral perturbation theory.

In the following we will start by presenting the lattice results for isospin-symmetric QCD. For any Standard Model analysis based on these results we then utilise chiral perturbation theory to correct for the leading isospin-breaking effects.

4.2 Lattice results for \(f_+(0)\) and \(f_K/f_\pi \)

The traditional way of determining \(|V_{us}|\) relies on using theory for the value of \(f_+(0)\), invoking the Ademollo–Gatto theorem [150]. Since this theorem only holds to leading order of the expansion in powers of \(m_{u}\), \(m_{d}\) and \(m_{s}\), theoretical models are used to estimate the corrections. Lattice methods have now reached the stage where quantities like \(f_+(0)\) or \(f_K/f_\pi \) can be determined to good accuracy. As a consequence, the uncertainties inherent in the theoretical estimates for the higher-order effects in the value of \(f_+(0)\) do not represent a limiting factor any more and we shall therefore not invoke those estimates. Also, we will use the experimental results based on nuclear \(\beta \) decay and \(\tau \) decay exclusively for comparison—the main aim of the present review is to assess the information gathered with lattice methods and to use it for testing the consistency of the SM and its potential to provide constraints for its extensions.

The database underlying the present review of the semileptonic form factor and the ratio of decay constants is listed in Tables 8 and 9. The properties of the lattice data play a crucial role for the conclusions to be drawn from these results: range of \(M_\pi \), size of \(L M_\pi \), continuum extrapolation, extrapolation in the quark masses, finite-size effects, etc. The key features of the various data sets are characterised by means of the colour code specified in Sect. 2.1. More detailed information on individual computations is compiled in Appendix B.2.

Table 8 Colour code for the data on \(f_+(0)\)
Table 9 Colour code for the data on the ratio of decay constants: \(f_K/f_\pi \) is the pure QCD SU(2)-symmetric ratio and \(f_{K^\pm }/f_{\pi ^\pm }\) is in pure QCD with the SU(2) isospin breaking applied after simulation

The quantity \(f_+(0)\) represents a matrix element of a strangeness changing null plane charge, \(f_+(0)\!=\!(K|Q^{us}|\pi )\). The vector charges obey the commutation relations of the Lie algebra of SU(3), in particular \([Q^{us},Q^{su}]=Q^{uu-ss}\). This relation implies the sum rule \(\sum _n |(K|Q^{us}|n)|^2-\sum _n |(K|Q^{su}|n)|^2=1\). Since the contribution from the one-pion intermediate state to the first sum is given by \(f_+(0)^2\), the relation amounts to an exact representation for this quantity [151]:

$$\begin{aligned} f_+(0)^2=1-\sum _{n\ne \pi } |(K|Q^{us}|n)|^2+\sum _n |(K|Q^{su}|n)|^2.\end{aligned}$$
(35)

While the first sum on the right extends over non-strange intermediate states, the second runs over exotic states with strangeness \(\pm 2\) and is expected to be small compared to the first.

The expansion of \(f_+(0)\) in SU(3) chiral perturbation theory in powers of \(m_{u}\), \(m_{d}\) and \(m_{s}\) starts with \(f_+(0)=1+f_2+f_4+\cdots \,\) [56]. Since all of the low energy constants occurring in \(f_2\) can be expressed in terms of \(M_\pi \), \(M_K\), \(M_\eta \) and \(f_\pi \) [152], the NLO correction is known. In the language of the sum rule (35), \(f_2\) stems from non-strange intermediate states with three mesons. Like all other non-exotic intermediate states, it lowers the value of \(f_+(0)\): \(f_2=-0.023\) when using the experimental value of \(f_\pi \) as input. The corresponding expressions have also been derived in quenched or partially quenched (staggered) chiral perturbation theory [140, 153]. At the same order in the SU(2) expansion [154], \(f_+(0)\) is parameterised in terms of \(M_\pi \) and two a priori unknown parameters. The latter can be determined from the dependence of the lattice results on the masses of the quarks. Note that any calculation that relies on the \(\chi \)PT formula for \(f_2\) is subject to the uncertainties inherent in NLO results: instead of using the physical value of the pion decay constant \(f_\pi \), one may, for instance, work with the constant \(f_0\) that occurs in the effective Lagrangian and represents the value of \(f_\pi \) in the chiral limit. Although trading \(f_\pi \) for \(f_0\) in the expression for the NLO term affects the result only at NNLO, it may make a significant numerical difference in calculations where the latter are not explicitly accounted for (the lattice results concerning the value of the ratio \(f_\pi /f_0\) are reviewed in Sect. 5.2).

The lattice results shown in the left panel of Fig. 4 indicate that the higher-order contributions \(\Delta f\equiv f_+(0)-1-f_2\) are negative and thus amplify the effect generated by \(f_2\). This confirms the expectation that the exotic contributions are small. The entries in the lower part of the left panel represent various model estimates for \(f_4\). In [175] the symmetry-breaking effects are estimated in the framework of the quark model. The more recent calculations are more sophisticated, as they make use of the known explicit expression for the \(K_{\ell 3}\) form factors to NNLO in \(\chi \)PT [174, 176]. The corresponding formula for \(f_4\) accounts for the chiral logarithms occurring at NNLO and is not subject to the ambiguity mentioned above (see footnote 11). The numerical result, however, depends on the model used to estimate the low-energy constants occurring in \(f_4\) [171–174]. The figure indicates that the most recent numbers obtained in this way correspond to a positive rather than a negative value for \(\Delta f\). We note that FNAL/MILC 12 [140] have made an attempt at determining some of the low-energy constants appearing in \(f_4\) from lattice data.

Fig. 4 Comparison of lattice results (squares) for \(f_+(0)\) and \(f_K/ f_\pi \) with various model estimates based on \(\chi \)PT (blue circles). The black squares and grey bands indicate our estimates. The significance of the colours is explained in Sect. 2

4.3 Direct determination of \(f_+(0)\) and \(f_{K^\pm }/f_{\pi ^\pm }\)

All lattice results for the form factor and the ratio of decay constants that we summarise here (Tables 8, 9) have been computed in isospin-symmetric QCD. The reason for this unphysical parameter choice is that simulations of SU(2) isospin-breaking effects in lattice QCD, while ultimately the cleanest way for predicting these effects, are still rare and in their infancy [32, 33, 40, 43, 105, 136, 137]. In the meantime one relies either on chiral perturbation theory [36, 56] to estimate the correction to the isospin limit or one calculates the breaking at leading order in \((m_{u}-m_{d})\) in the valence quark sector by making a suitable choice of the physical point to which the lattice data are extrapolated. Aubin 08, MILC and Laiho 11 for example extrapolate their simulation results for the kaon decay constant to the physical value of the up-quark mass (the results for the pion decay constant are extrapolated to the value of the average light-quark mass \(\hat{m}\)). This then defines their prediction for \(f_{K^\pm }/f_{\pi ^\pm }\).

As long as the majority of collaborations present their final results in the isospin-symmetric limit (as we will see this comprises the majority of results which qualify for inclusion into a FLAG average) we prefer to provide the overview of world data in Fig. 4 in this limit.

To this end we compute the isospin-symmetric ratio \(f_{K}/f_{\pi }\) for Aubin 08, MILC and Laiho 11 using NLO chiral perturbation theory [56, 177] where

$$\begin{aligned} \frac{f_K}{f_\pi }=\frac{1}{\sqrt{\delta _\mathrm{SU(2)}+1}} \frac{f_{K^\pm }}{f_{\pi ^\pm }}, \end{aligned}$$
(36)

and where [177],

$$\begin{aligned} \delta _\mathrm{SU(2)}&\approx \sqrt{3}\,\epsilon _\mathrm{SU(2)} \left[ -\frac{4}{3} \left( f_{K^\pm }/f_{\pi ^\pm }-1\right) \right. \nonumber \\&\left. +\,\frac{2}{3 (4\pi )^2 f_0^2} \left( M_K^2-M_\pi ^2-M_\pi ^2\ln \frac{M_K^2}{M_\pi ^2}\right) \right] . \end{aligned}$$
(37)

We use as input \(\epsilon _\mathrm{SU(2)}=\sqrt{3}/4/R\) with the FLAG result for \(R\) of Eq. (28), \(F_0=f_0/\sqrt{2}=80(20)\) MeV, \(M_\pi =135\) MeV and \(M_K=495\) MeV (we decided to choose a conservative uncertainty on \(f_0\) in order to reflect the magnitude of potential higher-order corrections) and obtain for example

 

| | \(f_{K^\pm }/f_{\pi ^\pm }\) | \(\delta _\mathrm{SU(2)}\) | \(f_K/f_\pi \) |
| --- | --- | --- | --- |
| Aubin 08 | 1.202(11)(9)(2)(5) | \(-\)0.0044(8) | 1.205(11)(2)(9)(2)(5) |
| MILC 10 | 1.197(2)(\(^{+3}_{-7}\)) | \(-\)0.0043(7) | 1.200(2)(2)(\(^{+3}_{-7}\)) |
| Laiho 11 | 1.191(16)(17) | \(-\)0.0041(9) | 1.193(16)(2)(17) |

(and similarly also for all other \(N_\mathrm{f}=2+1\) and \(N_\mathrm{f}=2+1+1\) results where applicable). In the last column the first error is statistical and the second is the one from the isospin correction (the remaining errors are quoted in the same order as in the original data). For \(N_\mathrm{f}=2\) a dedicated study of the strong-isospin correction in lattice QCD does exist. The result of the RM123 collaboration [105] amounts to \(\delta _\mathrm{SU(2)}=-0.0078(7)\) and we will later use this result for the correction in the case of \(N_\mathrm{f}=2\). We note that this value for the strong-isospin correction is incompatible with the above results based on SU(3) chiral perturbation theory. One would not expect the strange sea-quark contribution to be responsible for such a large effect. Whether higher-order effects in chiral perturbation theory or other sources are responsible still needs to be understood. To remain on the conservative side we attach the difference between the two- and three-flavour result as an additional uncertainty to the result based on chiral perturbation theory. For the further analysis we add both errors in quadrature.
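The numbers in the table above can be reproduced with a few lines of Python. This is a minimal sketch rather than the analysis code behind the table: Eq. (28) is not reproduced in this section, so the value `R_ASSUMED = 35.8` is an assumption chosen for illustration, and the treatment of the isospin error (the quoted error on \(\delta _\mathrm{SU(2)}\) combined in quadrature with the difference to the RM123 value, then propagated linearly through Eq. (36)) is one reading of the prescription described above. The function names are hypothetical.

```python
import math

# Inputs quoted in the text (masses and F_0 in MeV).
M_PI = 135.0
M_K = 495.0
F_0 = 80.0               # F_0 = f_0 / sqrt(2)
DELTA_RM123 = -0.0078    # N_f = 2 lattice value of delta_SU(2)

# ASSUMPTION: Eq. (28) is not reproduced in this section; R = 35.8 is an
# illustrative stand-in for the FLAG value of R.
R_ASSUMED = 35.8


def delta_su2(ratio, r=R_ASSUMED):
    """NLO estimate of delta_SU(2), Eq. (37), for a given decay-constant ratio."""
    sqrt3_eps = 3.0 / (4.0 * r)                  # sqrt(3) * epsilon_SU(2)
    f0_sq = 2.0 * F_0**2                         # f_0^2 = 2 F_0^2
    mass_term = M_K**2 - M_PI**2 - M_PI**2 * math.log(M_K**2 / M_PI**2)
    bracket = (-4.0 / 3.0 * (ratio - 1.0)
               + 2.0 * mass_term / (3.0 * (4.0 * math.pi)**2 * f0_sq))
    return sqrt3_eps * bracket


def to_isospin_symmetric(ratio_charged, delta_err):
    """Apply Eq. (36); return (delta, f_K/f_pi, error from the isospin correction)."""
    delta = delta_su2(ratio_charged)
    ratio_sym = ratio_charged / math.sqrt(1.0 + delta)
    # chiPT error and the two- versus three-flavour difference, in quadrature
    delta_tot = math.hypot(delta_err, DELTA_RM123 - delta)
    # linear propagation of delta_tot through Eq. (36)
    iso_err = 0.5 * ratio_charged * (1.0 + delta)**-1.5 * delta_tot
    return delta, ratio_sym, iso_err


for name, ratio, d_err in [("Aubin 08", 1.202, 0.0008),
                           ("MILC 10", 1.197, 0.0007),
                           ("Laiho 11", 1.191, 0.0009)]:
    delta, sym, iso = to_isospin_symmetric(ratio, d_err)
    print(f"{name}: delta = {delta:.4f}, f_K/f_pi = {sym:.3f}, isospin error = {iso:.3f}")
```

Under these assumptions the central values agree with the table to the quoted digits, and the isospin error comes out at about 0.002 for all three results.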

The plots in Fig. 4 illustrate our compilation of data for \(f_+(0)\) and \(f_K/f_\pi \). In both cases the lattice data are largely consistent even when comparing simulations with different \(N_\mathrm{f}\). We now proceed to form the corresponding averages, separately for the data with \(N_\mathrm{f}=2+1+1\), \(N_\mathrm{f}=2+1\) and \(N_\mathrm{f}=2\) dynamical flavours and in the following will refer to these averages as the “direct” determinations.

For \(f_+(0)\) there are currently two computational strategies: FNAL/MILC 12 and FNAL/MILC 13 use the Ward identity relating the \(K\rightarrow \pi \) form factor at zero momentum transfer to the matrix element \({\langle }\pi |S|K\rangle \) of the flavour-changing scalar current. Peculiarities of the staggered fermion discretisation (see [140]) which FNAL/MILC is using make this the favoured choice. The other collaborations instead compute the vector-current matrix element \({\langle }\pi |V_\mu |K\rangle \). Apart from FNAL/MILC 13C all simulations in Table 8 involve unphysically heavy quarks and therefore the lattice data need to be extrapolated to the physical pion and kaon masses corresponding to the \(K^0\rightarrow \pi ^-\) channel. We note that all state-of-the-art computations of \(f_+(0)\) use partially twisted boundary conditions, which allow one to determine the form factor directly at the relevant kinematical point \(q^2=0\) [178, 179].

The colour code in Table 8 shows that for \(f_+(0)\), presently only the result of ETM (we will be using ETM 09A [146]) with \(N_\mathrm{f}=2\) and the results by the FNAL/MILC and RBC/UKQCD collaborations with \(N_\mathrm{f}=2+1\) dynamical flavours of fermions, respectively, are without a red tag. The latter two results, \(f_+(0) =0.9670(20)(^{+18}_{-46})\) (RBC/UKQCD 13) and \(f_+(0) =0.9667(23)(33)\) (FNAL/MILC 12), agree very well. This is nice to observe given that the two collaborations are using different fermion discretisations (staggered fermions in the case of FNAL/MILC and domain-wall fermions in the case of RBC/UKQCD). Moreover, in the case of FNAL/MILC the form factor has been determined from the scalar-current matrix element while in the case of RBC/UKQCD it has been determined from the matrix element of the vector current. To a certain extent both simulations are expected to be affected by different systematic effects.

The result FNAL/MILC 12 is from simulations reaching down to a lightest RMS pion mass of about 380 MeV (the lightest valence pion mass for one of their ensembles is about 260 MeV). Their combined chiral and continuum extrapolation (results for two lattice spacings) is based on NLO staggered chiral perturbation theory supplemented by the continuum NNLO expression [174] and a phenomenological parameterisation of the breaking of the Ademollo–Gatto theorem at finite lattice spacing inherent in their approach. The \(p^4\) low-energy constants entering the NNLO expression have been fixed in terms of external input [57].

RBC/UKQCD 13 has analysed results on ensembles with pion masses down to 170 MeV, mapping out nearly the complete range from the SU(3)-symmetric limit to the physical point. Although no finite-volume or cutoff effects were observed in the simulation results, the expected residual finite-volume effects as estimated in NLO chiral perturbation theory and an order-of-magnitude estimate for cutoff effects were included in the overall error budget. The dominant systematic uncertainty is the one due to the extrapolation in the light-quark mass to the physical point, which RBC/UKQCD carried out with the help of a model motivated by and partly based on chiral perturbation theory. The model dependence is estimated by comparing different ansätze for the mass extrapolation.

The ETM collaboration which uses the twisted-mass discretisation provides a comprehensive study of the systematics by presenting results for three lattice spacings [180] and simulating at light pion masses (down to \(M_\pi =260\) MeV). This allows one to constrain the chiral extrapolation, using both SU(3) [152] and SU(2) [154] chiral perturbation theory. Moreover, a rough estimate for the size of the effects due to quenching the strange quark is given, based on the comparison of the result for \(N_\mathrm{f}=2\) dynamical quark flavours [169] with the one in the quenched approximation, obtained earlier by the SPQcdR collaboration [181]. We note for completeness that ETM extrapolate their lattice results to the point corresponding to \(M_K^2\) and \(M_\pi ^2\) as defined at the end of Sect. 4.1. At the current level of precision though this is expected to be a tiny effect.

We now compute the \(N_\mathrm{f} =2+1\) FLAG-average for \(f_+(0)\) based on FNAL/MILC 12 and RBC/UKQCD 13, which we consider uncorrelated, and for \(N_\mathrm{f}=2\) the only result fulfilling the FLAG criteria is ETM 09A,

$$\begin{aligned} f_+(0)= 0.9661(32), \quad (\hbox {direct},\,N_\mathrm{f}=2+1), \nonumber \\ f_+(0)= 0.9560(57)(62), \quad (\hbox {direct},\,N_\mathrm{f}=2). \end{aligned}$$
(38)

The brackets in the second line indicate the statistical and systematic errors, respectively. The dominant source of systematic uncertainty in these simulations of \(f_+(0)\), the chiral extrapolation, will soon be removed by simulations with physical light quark masses (see FNAL/MILC 13C [138] and RBC/UKQCD [182]).
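The \(N_\mathrm{f}=2+1\) entry of Eq. (38) can be illustrated with an inverse-variance weighted average of the two uncorrelated results quoted above, FNAL/MILC 12 and RBC/UKQCD 13. The sketch below rests on two assumptions that are not spelled out in this section: the asymmetric RBC/UKQCD 13 error is symmetrised by shifting the central value by half the asymmetry, and the error of the average is not allowed to drop below the smallest systematic error of the inputs. Both are readings of the averaging prescription of Sect. 2.3, and the helper names are hypothetical.

```python
import math


def symmetrise(value, err_plus, err_minus):
    """ASSUMED convention: shift the value by half the asymmetry, average the errors."""
    return value + 0.5 * (err_plus - err_minus), 0.5 * (err_plus + err_minus)


def weighted_average(values, errors):
    """Inverse-variance weighted mean and its error for uncorrelated inputs."""
    weights = [1.0 / e**2 for e in errors]
    mean = sum(w * v for w, v in zip(weights, values)) / sum(weights)
    return mean, 1.0 / math.sqrt(sum(weights))


# FNAL/MILC 12: 0.9667(23)(33)
fnal_value, fnal_stat, fnal_syst = 0.9667, 0.0023, 0.0033
# RBC/UKQCD 13: 0.9670(20)(+18/-46)
rbc_value, rbc_syst = symmetrise(0.9670, 0.0018, 0.0046)
rbc_stat = 0.0020

values = [fnal_value, rbc_value]
errors = [math.hypot(fnal_stat, fnal_syst), math.hypot(rbc_stat, rbc_syst)]
mean, fit_error = weighted_average(values, errors)

# ASSUMED reading of Sect. 2.3: never quote less than the smallest systematic error.
smallest_syst = min(fnal_syst, rbc_syst)
quoted_error = max(fit_error, smallest_syst)

print(f"f_+(0) = {mean:.4f}, fit error {fit_error:.4f}, quoted error {quoted_error:.4f}")
```

With these choices the central value and the final error agree with the first line of Eq. (38).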

In the case of the ratio of decay constants the data sets that meet the criteria formulated in the introduction are MILC 13A [157] and HPQCD 13A [156] with \(N_\mathrm{f}=2+1+1\), MILC 10 [159], BMW 10 [161], HPQCD/UKQCD 07 [165] and RBC/UKQCD 12 [25] (which is an update of RBC/UKQCD 10A [78]) with \(N_\mathrm{f}=2+1\) and ETM 09 [169] with \(N_\mathrm{f}=2\) dynamical flavours.

MILC 13A have determined the ratio of decay constants from a comprehensive set of ensembles of Highly Improved Staggered Quarks (HISQ) which have been tailored to reduce staggered taste-breaking effects. They have generated ensembles for four values of the lattice spacing (0.06–0.15 fm, scale set with \(f_\pi \)) and with the Goldstone pion masses approximately tuned to the physical point which at least on their finest lattice approximately agrees with the RMS pion mass (i.e. the difference in mass between different pion species which originates from staggered taste splitting). Supplementary simulations with slightly heavier Goldstone pion mass allow one to extract the ratio of decay constants for the physical value of the light-quark masses by means of polynomial interpolations. In a second step MILC extrapolates the data to the continuum limit where eventually the ratio \( {f_{K^\pm }}/{f_{\pi ^\pm }}\) is extracted. The final result of their analysis is \( {f_{K^\pm }}/{f_{\pi ^\pm }}=1.1947(26)(33)(17)(2)\) where the errors are statistical, due to the continuum extrapolation, due to finite volume effects and due to electromagnetic effects. MILC has found an increase in the central value of the ratio when going from the second finest to their finest ensemble and from this observation they derive the quoted 0.28 % uncertainty in the continuum extrapolation. They use NLO staggered chiral perturbation theory to correct for finite-volume effects and estimate the uncertainty in this approach by comparing to the alternative correction in NLO and NNLO continuum chiral perturbation theory.
Although MILC and HPQCD are independent collaborations, MILC shares its gauge-field ensembles with HPQCD 13A, whose study of \( {f_{K^\pm }}/{f_{\pi ^\pm }}\) is therefore based on the same set of ensembles bar the one for the finest lattice spacing (\(a=\) 0.09–0.15 fm, scale set with \(f_{\pi ^+}\) and relative scale set with the Wilson flow [183, 184]) supplemented by some simulation points with heavier quark masses. HPQCD employed a global fit based on continuum NLO SU(3) chiral perturbation theory for the decay constants supplemented by a model for higher-order terms including discretisation and finite-volume effects (61 parameters for 39 data points supplemented by Bayesian priors). Their final result is \(f_{K^\pm }/f_{\pi ^\pm }=1.1916(15)(12)(1)(10)\), where the errors are statistical, due to the continuum extrapolation, due to finite-volume effects and the last error contains the combined uncertainties from the chiral extrapolation, the scale-setting uncertainty, the experimental input in terms of \(f_{\pi ^+}\) and from the uncertainty in \(m_{u}/m_{d}\).

Despite the large overlap in primary lattice data both collaborations arrive at surprisingly different error budgets. In the preparation of this report we interacted with both collaborations trying to understand the origin of the differences. HPQCD is using a rather new method to set the relative lattice scale for their ensembles which, together with their more aggressive binning of the statistical samples, could explain the reduction in statistical error by a factor of 1.7 compared to MILC. Concerning the cutoff dependence, the finest lattice included in MILC’s analysis is \(a=0.06\) fm while the finest lattice in HPQCD’s case is \(a=0.09\) fm. MILC estimates the residual systematic after extrapolating to the continuum limit by taking the split between the result of an extrapolation with up to quartic and only up to quadratic terms in \(a\) as their systematic. HPQCD on the other hand models cutoff effects within their global fit ansatz up to and including terms in \(a^8\). In this way HPQCD arrives at a systematic error due to the continuum limit which is smaller than MILC’s estimate by about a factor of 2.8. HPQCD explains (footnote 12) that in their setup, despite lacking the information from the fine ensemble (\(a=0.06\) fm), the approach to the continuum limit is reliably described by the chosen fit formula leaving no room for the shift in the result on the finest lattice observed by MILC. They further explain that their different way of setting the relative lattice scale leads to reduced cutoff effects compared to MILC’s study. We now turn to finite-volume effects, which in the MILC result are the second-largest source of systematic uncertainty. NLO staggered chiral perturbation theory (MILC) or continuum chiral perturbation theory (HPQCD) was used for correcting the lattice data towards the infinite-volume limit.
MILC then compared the finite-volume correction to the one obtained by the NNLO expression and took the difference as their estimate for the residual finite-volume error. In addition they checked the compatibility of the effective theory predictions (NLO continuum, staggered and NNLO continuum chiral perturbation theory) against lattice data of different spatial extent. The final verdict on the related residual systematic uncertainty on \( {f_{K^\pm }}/{f_{\pi ^\pm }}\) made by MILC is larger by an order of magnitude than the one made by HPQCD. We note that only HPQCD allows for taste-breaking terms in their fit model while MILC postpones such studies to future work.
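The factors quoted in this comparison follow directly from the two error budgets given above. The short sketch below only restates that arithmetic; the dictionary keys are labels chosen here for illustration, and the last HPQCD entry lumps together the chiral extrapolation, scale setting, \(f_{\pi ^+}\) input and \(m_{u}/m_{d}\) as in the quoted result.

```python
import math

# Error components on f_K+/f_pi+ in units of 1e-4, as quoted in the text.
milc_13a = {"statistical": 26, "continuum": 33, "finite volume": 17, "electromagnetic": 2}
hpqcd_13a = {"statistical": 15, "continuum": 12, "finite volume": 1, "other": 10}


def total(budget):
    """Quadrature sum of all components of one error budget."""
    return math.sqrt(sum(e**2 for e in budget.values()))


# Ratio MILC / HPQCD for the components that both budgets list separately.
shared = ("statistical", "continuum", "finite volume")
ratios = {key: milc_13a[key] / hpqcd_13a[key] for key in shared}

for key in shared:
    print(f"{key}: MILC/HPQCD = {ratios[key]:.2f}")
print(f"total: MILC {total(milc_13a):.1f}e-4, HPQCD {total(hpqcd_13a):.1f}e-4")
```

The ratios come out at about 1.7 (statistical), 2.75 (continuum extrapolation) and 17 (finite volume), matching the factors stated in the text.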

The above comparison shows that MILC and HPQCD have studied similar sources of systematic uncertainties, e.g. by varying parts of the analysis procedure or by changing the functional form of a given fit ansatz. One observation worth mentioning in this context is the way in which the resulting variations in the fit result are treated. MILC tends to include the spread in central values from different ansätze into the systematic errors. HPQCD on the other hand determines the final result and attached errors from a preferred fit ansatz and then confirms that it agrees within errors with results from other ansätze without including the spreads into their error budget. In this way HPQCD is lifting the calculation of \( {f_{K^\pm }}/{f_{\pi ^\pm }}\) to a new level of precision. FLAG is looking forward to independent confirmations of the result for \( {f_{K^\pm }}/{f_{\pi ^\pm }}\) at the same level of precision. For now we will only provide a range for the result for \(N_\mathrm{f}=2+1+1\) that covers the result of both HPQCD 13A and MILC 13A,

$$\begin{aligned} {f_{K^\pm }}/{f_{\pi ^\pm }}\!=\!1.194(5)\quad (\hbox {our estimate, direct, }N_\mathrm{f}\!=\!2\!+\!1\!+\!1)\nonumber \\ \end{aligned}$$
(39)

Concerning simulations with \(N_\mathrm{f}\!=\!2+1\), MILC 10 and HPQCD/UKQCD 07 are based on staggered fermions, BMW 10 has used improved Wilson fermions and RBC/UKQCD 12’s result is based on the domain-wall formulation. For \(N_\mathrm{f}=2\) ETM has simulated twisted-mass fermions. In contrast to MILC 13A all these latter simulations are for unphysically heavy quark masses (corresponding to smallest pion masses in the range 240–260 MeV in the case of MILC 10, HPQCD/UKQCD 07 and ETM 09 and around 170 MeV for RBC/UKQCD 12) and therefore slightly more sophisticated extrapolations needed to be controlled. Various ansätze for the mass and cutoff dependence comprising SU(2) and SU(3) chiral perturbation theory or simply polynomials were used and compared in order to estimate the model dependence.

We now provide the FLAG average for these data. While BMW 10 and RBC/UKQCD 12 are entirely independent computations, subsets of the MILC gauge ensembles used by MILC 10 and HPQCD/UKQCD 07 are the same. MILC 10 is certainly based on a larger and more advanced set of gauge configurations than HPQCD/UKQCD 07, which allows for a more reliable estimation of systematic effects. In this situation we consider only their statistical but not their systematic uncertainties to be correlated. For \(N_\mathrm{f}=2\) the FLAG average is just the result by ETM 09 and this is illustrated in terms of the vertical grey band in the r.h.s. panel of Fig. 4. For the purpose of this plot only, the isospin correction has been removed along the lines laid out earlier. For the average indicated in the case of \(N_\mathrm{f}=2+1\) we take the original data of BMW 10, HPQCD/UKQCD 07 and RBC/UKQCD 12 and use the MILC 10 result as computed above. The resulting fit is of good quality, with \(f_K/f_\pi =1.194(4)\) and \(\chi ^2/\hbox {dof}=0.4\). The systematic errors of the individual data sets (MILC 10, BMW 10, HPQCD/UKQCD 07 and RBC/UKQCD 12) are each larger than the error of this fit, and following again the prescription of Sect. 2.3 we replace the error by the smallest one of these, leading to \(f_K / f_\pi = 1.194(5)\) for \(N_\mathrm{f}=2+1\).
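The structure of this \(N_\mathrm{f}=2+1\) average can be sketched as follows. The code is an illustration under stated assumptions, not the procedure of Sect. 2.3 itself, which is not reproduced here: weights are taken as inverse total variances, the statistical errors of MILC 10 and HPQCD/UKQCD 07 are treated as fully correlated when computing the error of the mean, the asymmetric MILC 10 error is symmetrised by shifting the central value by half the asymmetry, and \(\chi ^2\) is evaluated with the total errors while ignoring the correlation.

```python
import math

# Isospin-symmetric f_K/f_pi inputs: (label, value, statistical error, other errors).
# ASSUMPTION: the asymmetric MILC 10 error (+3/-7) is symmetrised by shifting the
# central value by half the asymmetry and using the mean of the two errors.
results = [
    ("MILC 10", 1.200 + 0.5 * (0.003 - 0.007), 0.002, [0.002, 0.5 * (0.003 + 0.007)]),
    ("BMW 10", 1.192, 0.007, [0.006]),
    ("HPQCD/UKQCD 07", 1.189, 0.002, [0.007]),
    ("RBC/UKQCD 12", 1.199, 0.012, [0.014]),
]
# Index pairs whose statistical errors are treated as fully correlated.
correlated_pairs = [(0, 2)]

values = [value for _, value, _, _ in results]
sigmas = [math.sqrt(stat**2 + sum(e**2 for e in other)) for _, _, stat, other in results]

inverse_variances = [1.0 / s**2 for s in sigmas]
norm = sum(inverse_variances)
weights = [w / norm for w in inverse_variances]

mean = sum(w * v for w, v in zip(weights, values))

# Variance of the weighted mean: diagonal part plus the correlated statistical part.
variance = sum((w * s)**2 for w, s in zip(weights, sigmas))
for i, j in correlated_pairs:
    variance += 2.0 * weights[i] * weights[j] * results[i][2] * results[j][2]
error = math.sqrt(variance)

# chi^2 evaluated with the total errors, ignoring the correlation (a simplification).
chi2_dof = sum(((v - mean) / s)**2 for v, s in zip(values, sigmas)) / (len(values) - 1)

print(f"f_K/f_pi = {mean:.3f} +/- {error:.3f}, chi2/dof = {chi2_dof:.1f}")
```

Under these assumptions the sketch gives a mean of 1.194, an error of 0.004 and \(\chi ^2/\hbox {dof}\) of about 0.4, in line with the fit quoted above.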

Before determining the average for \(f_{K^\pm }/f_{\pi ^\pm }\) which should be used for applications to Standard Model phenomenology we apply the isospin correction individually to all those results which have been published in the isospin-symmetric limit, i.e. BMW 10, HPQCD/UKQCD 07 and RBC/UKQCD 12. To this end we invert Eq. (36) and use

$$\begin{aligned} \delta _\mathrm{SU(2)}&\approx \sqrt{3}\,\epsilon _\mathrm{SU(2)} \left[ -\frac{4}{3} (f_K/f_\pi -1)\right. \nonumber \\&\left. +\,\frac{2}{3(4\pi )^2 f_0^2}\left( M_K^2-M_\pi ^2- M_\pi ^2\ln \frac{M_K^2}{M_\pi ^2}\right) \right] . \end{aligned}$$
(40)

The results are:

 

$$\begin{array}{llll} &{} f_K/f_\pi &{} \delta _\mathrm{SU(2)} &{} f_{K^\pm }/f_{\pi ^\pm }\\ \hbox {HPQCD/UKQCD 07} &{} 1.189(2)(7) &{} -0.0040(7) &{} 1.187(2)(2)(7)\\ \hbox {BMW 10} &{} 1.192(7)(6) &{} -0.0041(7) &{} 1.190(7)(2)(6)\\ \hbox {RBC/UKQCD 12} &{} 1.199(12)(14) &{} -0.0043(9) &{} 1.196(12)(2)(14) \end{array}$$

As before, in the last column the first error is statistical and the second error is due to the isospin correction. Using these results we obtain

$$\begin{aligned} \begin{array}{rcll} f_{K^\pm } / f_{\pi ^\pm }&{}=&{} 1.192(5), \; &{}(\hbox {direct},\, N_\mathrm{f}=2+1),\\ f_{K^\pm } / f_{\pi ^\pm }&{}=&{} 1.205(6)(17), \; &{}(\hbox {direct},\, N_\mathrm{f}=2), \end{array} \end{aligned}$$
(41)

for QCD with broken isospin.
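The inversion of Eq. (36) used for the table above can be written out explicitly. The following is only an illustrative evaluation with the rounded HPQCD/UKQCD 07 entries of that table; the first-order expansion in \(\delta _\mathrm{SU(2)}\) is added here for orientation and is not taken from the original analysis:

$$\begin{aligned} \frac{f_{K^\pm }}{f_{\pi ^\pm }}&=\sqrt{1+\delta _\mathrm{SU(2)}}\;\frac{f_K}{f_\pi }\approx \left( 1+\frac{\delta _\mathrm{SU(2)}}{2}\right) \frac{f_K}{f_\pi },\nonumber \\ \frac{f_{K^\pm }}{f_{\pi ^\pm }}\bigg |_\mathrm{HPQCD/UKQCD\,07}&\approx \sqrt{1-0.0040}\times 1.189\approx 0.9980\times 1.189\approx 1.187. \end{aligned}$$

The correction thus lowers the ratio by about 0.2 %, which is smaller than the total error of each of the three results.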

It is instructive to convert the above results for \(f_+(0)\) and \( {f_{K^\pm }}/{f_{\pi ^\pm }}\) into a corresponding range for the CKM matrix elements \(|V_{ud}|\) and \(|V_{us}|\), using the relations (32). Consider first the results for \(N_\mathrm{f}=2+1\). The range for \(f_+(0)\) in (38) is mapped into the interval \(|V_{us}|=0.2239(7)\), depicted as a horizontal green band in Fig. 5, while the one for \( {f_{K^\pm }}/{f_{\pi ^\pm }}\) in (41) is converted into \(|V_{us}|/|V_{ud}|= 0.2314(11)\), shown as a tilted green band. The smaller green ellipse is the intersection of these two bands.

Fig. 5 The plot compares the information for \(|V_{ud}|\), \(|V_{us}|\) obtained on the lattice with the experimental result extracted from nuclear \(\beta \) transitions. The dotted arc indicates the correlation between \(|V_{ud}|\) and \(|V_{us}|\) that follows if the three-flavour CKM-matrix is unitary.

More precisely, it represents the 68 % likelihood contour (note also that the ellipses shown in Fig. 5 of Ref. [1] have to be interpreted as 39 % likelihood contours), obtained by treating the above two results as independent measurements. Values of \(|V_{us}|\), \(|V_{ud}|\) in the region enclosed by this contour are consistent with the lattice data for \(N_\mathrm{f}=2+1\), within one standard deviation. In particular, the plot shows that the nuclear \(\beta \) decay result for \(|V_{ud}|\) is in good agreement with these data. We note that with respect to the previous edition of the FLAG review the reanalysis including new results has moved the ellipse representing QCD with \(N_\mathrm{f}=2+1\) slightly down and to the left.

Repeating the exercise for \(N_\mathrm{f}=2\) leads to the larger blue ellipse. The figure indicates a slight tension between the \(N_\mathrm{f}=2\) and \(N_\mathrm{f}=2+1\) results, which, at the current level of precision, is not visible when considering the \(N_\mathrm{f}=2\) and \(N_\mathrm{f}=2+1\) results for \(f_+(0)\) and \( {f_{K^\pm }}/{f_{\pi ^\pm }}\) in Fig. 4 on their own. It remains to be seen if this is a first indication of the effect of quenching the strange quark.

In the case of \(N_\mathrm{f}=2+1+1\) only results for \( {f_{K^\pm }}/{f_{\pi ^\pm }}\) are without red tags. In this case we have therefore plotted only the band that follows from \(f_{K^\pm }/f_{\pi ^\pm }\), corresponding to \(|V_{us}|/|V_{ud}|=0.2310(11)\).
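The conversion of the lattice results into bands for the CKM elements is a simple division by the experimentally determined products. The relations (32) are not reproduced in this section, so the two constants below are assumptions chosen for illustration (they stand for the measured values of \(|V_{us}|f_+(0)\) and \(|V_{us}/V_{ud}|\,f_{K^\pm }/f_{\pi ^\pm }\)); only central values are computed, and the function names are hypothetical.

```python
# ASSUMPTION: the experimental constraints of Eq. (32) are not reproduced in this
# section; the two products below are illustrative stand-ins for them.
VUS_TIMES_FPLUS = 0.2163      # |V_us| f_+(0)
RATIO_TIMES_FK_FPI = 0.2758   # |V_us / V_ud| f_K+/f_pi+


def vus_from_fplus(f_plus):
    """Central value of |V_us| implied by a lattice value of f_+(0)."""
    return VUS_TIMES_FPLUS / f_plus


def vus_over_vud_from_ratio(fk_fpi):
    """Central value of |V_us|/|V_ud| implied by a lattice value of f_K+/f_pi+."""
    return RATIO_TIMES_FK_FPI / fk_fpi


# Direct lattice results of Eqs. (38), (39) and (41).
print(f"N_f=2+1:   |V_us| = {vus_from_fplus(0.9661):.4f}")
print(f"N_f=2+1:   |V_us|/|V_ud| = {vus_over_vud_from_ratio(1.192):.4f}")
print(f"N_f=2+1+1: |V_us|/|V_ud| = {vus_over_vud_from_ratio(1.194):.4f}")

# The centre of the N_f=2+1 ellipse is where the two bands cross.
v_us = vus_from_fplus(0.9661)
v_ud = v_us / vus_over_vud_from_ratio(1.192)
print(f"N_f=2+1:   band crossing at |V_ud| = {v_ud:.4f}, |V_us| = {v_us:.4f}")
```

With these inputs the central values of the bands quoted in the text are recovered.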

4.4 Testing the Standard Model

In the Standard Model, the CKM matrix is unitary. In particular, the elements of the first row obey

$$\begin{aligned} |V_{u}|^2\equiv |V_{ud}|^2 + |V_{us}|^2 + |V_{ub}|^2 = 1.\end{aligned}$$
(42)

The tiny contribution from \(|V_{ub}|\) is known much better than needed in the present context: \(|V_{ub}|= 4.15 (49) \cdot 10^{-3}\) [74]. In the following, we first discuss the evidence for the validity of the relation (42) and only then use it to analyse the lattice data within the Standard Model.

In Fig. 5, the correlation between \(|V_{ud}|\) and \(|V_{us}|\) imposed by the unitarity of the CKM matrix is indicated by a dotted arc (more precisely, in view of the uncertainty in \(|V_{ub}|\), the correlation corresponds to a band of finite width, but the effect is too small to be seen here).

The plot shows that there is a slight tension with unitarity in the data for \(N_\mathrm{f} = 2 + 1\): Numerically, the outcome for the sum of the squares of the first row of the CKM matrix reads \(|V_{u}|^2 = 0.987(10)\). Still, it is fair to say that at this level the Standard Model passes a non-trivial test that exclusively involves lattice data and well-established kaon decay branching ratios. Combining the lattice results for \(f_+(0)\) and \( {f_{K^\pm }}/{f_{\pi ^\pm }}\) in (38) and (41) with the \(\beta \) decay value of \(|V_{ud}|\) quoted in (33), the test sharpens considerably: the lattice result for \(f_+(0)\) leads to \(|V_{u}|^2 = 0.9993(5)\), while the one for \( {f_{K^\pm }}/{f_{\pi ^\pm }}\) implies \(|V_{u}|^2 = 1.0000(6)\), thus confirming CKM unitarity at the permille level.

Repeating the analysis for \(N_\mathrm{f} = 2\), we find \(|V_{u}|^2 = 1.029(35)\) with the lattice data alone. This number is fully compatible with 1, in accordance with the fact that the dotted curve penetrates the blue contour. Taken by themselves, these results are perfectly consistent with the value of \(|V_{ud}|\) found in nuclear \(\beta \) decay: combining this value with the data on \(f_+(0)\) yields \(|V_{u}|^2=1.0004(10)\), combining it with the data on \( {f_{K^\pm }}/{f_{\pi ^\pm }}\) gives \(|V_{u}|^2= 0.9989(16)\). With respect to the first edition of the FLAG report the ellipse for \(N_\mathrm{f}=2\) has moved slightly to the left because we have now taken into account isospin-breaking effects.

For \(N_\mathrm{f}=2+1+1\) we can carry out the test of unitarity only with input from \( {f_{K^\pm }}/{f_{\pi ^\pm }}\) which leads to \(|V_{u}|^2=0.9998(7)\).
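The central values of these unitarity tests can be checked with the rounded numbers quoted above. In the sketch below, \(|V_{ub}|\), \(|V_{us}|\) and \(|V_{us}|/|V_{ud}|\) are taken from the text, whereas the nuclear \(\beta \) decay value of \(|V_{ud}|\) of Eq. (33) is not reproduced in this section; `V_UD_BETA = 0.97425` is therefore an assumption used for illustration. Errors are not propagated.

```python
V_UB = 4.15e-3  # quoted in the text

# ASSUMPTION: Eq. (33) is not reproduced in this section; 0.97425 is an
# illustrative stand-in for the nuclear beta-decay value of |V_ud|.
V_UD_BETA = 0.97425


def first_row_sum(v_ud, v_us, v_ub=V_UB):
    """|V_u|^2 = |V_ud|^2 + |V_us|^2 + |V_ub|^2, Eq. (42)."""
    return v_ud**2 + v_us**2 + v_ub**2


# N_f = 2+1 values quoted in the text.
v_us = 0.2239    # from f_+(0)
ratio = 0.2314   # |V_us|/|V_ud| from f_K+/f_pi+

lattice_only = first_row_sum(v_us / ratio, v_us)
with_fplus = first_row_sum(V_UD_BETA, v_us)
with_ratio = first_row_sum(V_UD_BETA, ratio * V_UD_BETA)
# N_f = 2+1+1: only the ratio is available.
with_ratio_2p1p1 = first_row_sum(V_UD_BETA, 0.2310 * V_UD_BETA)

print(f"lattice only:             {lattice_only:.4f}")
print(f"beta decay + f_+(0):      {with_fplus:.4f}")
print(f"beta decay + f_K/f_pi:    {with_ratio:.4f}")
print(f"beta decay + f_K/f_pi (N_f=2+1+1): {with_ratio_2p1p1:.4f}")
```

The lattice-only sum evaluated from the rounded inputs lies well within the quoted uncertainty of 0.987(10), and the three sums that use the \(\beta \) decay input agree with the quoted central values to four decimal places.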

Note that the above tests also offer a check of the basic hypothesis that underlies our analysis: we are assuming that the weak interaction between the quarks and the leptons is governed by the same Fermi constant as the one that determines the strength of the weak interaction among the leptons and determines the lifetime of the muon. In certain modifications of the Standard Model, this is not the case. In those models it need not be true that the rates of the decays \(\pi \rightarrow \ell \nu \), \(K\rightarrow \ell \nu \) and \(K\rightarrow \pi \ell \nu \) can be used to determine the matrix elements \(|V_{ud}f_\pi |\), \(|V_{us}f_K|\) and \(|V_{us}f_+(0)|\), respectively and that \(|V_{ud}|\) can be measured in nuclear \(\beta \) decay. The fact that the lattice data are consistent with unitarity and with the value of \(|V_{ud}|\) found in nuclear \(\beta \) decay indirectly also checks the equality of the Fermi constants.

4.5 Analysis within the Standard Model

The Standard Model implies that the CKM matrix is unitary. The precise experimental constraints quoted in (32) and the unitarity condition (42) then reduce the four quantities \(|V_{ud}|,|V_{us}|,f_+(0), {f_{K^\pm }}/{f_{\pi ^\pm }}\) to a single unknown: any one of these determines the other three within narrow uncertainties.

Figure 6 shows that the results obtained for \(|V_{us}|\) and \(|V_{ud}|\) from the data on \( {f_{K^\pm }}/{f_{\pi ^\pm }}\) (squares) are quite consistent with the determinations via \(f_+(0)\) (triangles). In order to calculate the corresponding average values, we restrict ourselves to those determinations that we have considered best in Sect. 4.3. The corresponding results for \(|V_{us}|\) are listed in Table 10 (the error in the experimental numbers used to convert the values of \(f_+(0)\) and \( {f_{K^\pm }}/{f_{\pi ^\pm }}\) into values for \(|V_{us}|\) is included in the statistical error).

Table 10 Values of \(|V_{us}|\) obtained from lattice determinations of \(f_+(0)\) or \( {f_{K^\pm }}/{f_{\pi ^\pm }}\) with CKM unitarity. The first (second) number in brackets represents the statistical (systematic) error.

Fig. 6 Results for \(|V_{us}|\) and \(|V_{ud}|\) that follow from the lattice data for \(f_+(0)\) (triangles) and \( {f_{K^\pm }}/{f_{\pi ^\pm }}\) (squares), on the basis of the assumption that the CKM matrix is unitary. The black squares and the grey bands represent our estimates, obtained by combining these two different ways of measuring \(|V_{us}|\) and \(|V_{ud}|\) on a lattice. For comparison, the figure also indicates the results obtained if the data on nuclear \(\beta \) decay and \(\tau \) decay are analysed within the Standard Model.

We consider the fact that the results from the six \(N_\mathrm{f}=2+1\) data sets FNAL/MILC 12 [140], RBC/UKQCD 13 [139], RBC/UKQCD 12 [25], BMW 10 [161], MILC 10 [159] and HPQCD/UKQCD 07 [165] are consistent with each other to be an important reliability test of the lattice work. Applying the prescription of Sect. 2.3, where we consider MILC 10, FNAL/MILC 12 and HPQCD/UKQCD 07 on the one hand and RBC/UKQCD 12 and RBC/UKQCD 13 on the other hand, as mutually statistically correlated since the analysis in the two cases starts from partly the same set of gauge ensembles, we arrive at \(|V_{us}| = 0.2247(7)\) with \(\chi ^2/\hbox {dof}=0.8\). This result is indicated on the left hand side of Fig. 6 by the narrow vertical band. The value for \(N_\mathrm{f}=2\), \(|V_{us}|= 0.2253(21)\), with \(\chi ^2/\hbox {dof}=0.9\), where we have considered ETM 09 and ETM 09A as statistically correlated, is also indicated by a band. For \(N_\mathrm{f}=2+1+1\) we only consider the data for \( {f_{K^\pm }}/{f_{\pi ^\pm }}\) yielding \(|V_{us}|=0.2251(10)\). The figure shows that the results obtained for the data with \(N_\mathrm{f}=2\), \(N_\mathrm{f}=2+1\) and \(N_\mathrm{f}=2+1+1\) are perfectly consistent.

Alternatively, we can solve the relations for \(|V_{ud}|\) instead of \(|V_{us}|\). Again, the result \(|V_{ud}|=0.97434(22)\) which follows from the lattice data with \(N_\mathrm{f}=2+1+1\) is perfectly consistent with the values \(|V_{ud}|=0.97447(18)\) and \(|V_{ud}|=0.97427(49)\) obtained from those with \(N_\mathrm{f}=2+1\) and \(N_\mathrm{f}=2\), respectively. The reduction of the uncertainties in the result for \(|V_{ud}|\) due to CKM unitarity is to be expected from Fig. 5: the unitarity condition reduces the region allowed by the lattice results to a nearly vertical interval.
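The size of this reduction can be made explicit by linear error propagation through the unitarity condition (42). The following is a sketch that neglects the tiny contribution from the uncertainty in \(|V_{ub}|\); the numerical check uses the \(N_\mathrm{f}=2\) values quoted above:

$$\begin{aligned} |V_{ud}|&=\sqrt{1-|V_{us}|^2-|V_{ub}|^2},\qquad \delta |V_{ud}|\approx \frac{|V_{us}|}{|V_{ud}|}\,\delta |V_{us}|,\nonumber \\ \delta |V_{ud}|&\approx \frac{0.2253}{0.97427}\times 0.0021\approx 0.00049\quad (N_\mathrm{f}=2). \end{aligned}$$

Because \(|V_{us}|/|V_{ud}|\approx 0.23\), an uncertainty in \(|V_{us}|\) is suppressed by roughly a factor of four when it is translated into \(|V_{ud}|\).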

Next, we determine the value of \(f_+(0)\) that follows from the lattice data within the Standard Model. Using CKM unitarity to convert the lattice determinations of \( {f_{K^\pm }}/{f_{\pi ^\pm }}\) into corresponding values for \(f_+(0)\) and then combining these with the direct determinations of \(f_+(0)\), we find \(f_+(0)= 0.9634(32)\) from the data with \(N_\mathrm{f}=2+1\) and \(f_+(0)= 0.9595(90)\) for \(N_\mathrm{f}=2\). In the case \(N_\mathrm{f}=2+1+1\) we obtain \(f_+(0)=0.9611(47)\).

Finally, we work out the analogous Standard Model fits for \( {f_{K^\pm }}/{f_{\pi ^\pm }}\), converting the direct determinations of \(f_+(0)\) into corresponding values for \( {f_{K^\pm }}/{f_{\pi ^\pm }}\) and combining the outcome with the direct determinations of that quantity. The results read \( {f_{K^\pm }}/{f_{\pi ^\pm }}=1.197(4)\) for \(N_\mathrm{f}=2+1\) and \( {f_{K^\pm }}/{f_{\pi ^\pm }}= 1.192(12) \) for \(N_\mathrm{f}=2\), respectively.

The results obtained by analysing the lattice data in the framework of the Standard Model are collected in the upper half of Table 11. In the lower half of this table, we list the analogous results, found by working out the consequences of CKM unitarity for the experimental values of \(|V_{ud}|\) and \(|V_{us}|\) obtained from nuclear \(\beta \) decay and \(\tau \) decay, respectively. The comparison shows that the lattice result for \(|V_{ud}|\) not only agrees very well with the totally independent determination based on nuclear \(\beta \) transitions, but it is also remarkably precise. On the other hand, the values of \(|V_{ud}|\), \(f_+(0)\) and \( {f_{K^\pm }}/{f_{\pi ^\pm }}\) which follow from the \(\tau \) decay data if the Standard Model is assumed to be valid, are not in good agreement with the lattice results for these quantities. The disagreement is reduced considerably if the analysis of the \(\tau \) data is supplemented with experimental results on electroproduction [128]: the discrepancy then amounts to little more than one standard deviation.

Table 11 The upper half of the table shows our final results for \(|V_{us}|\), \(|V_{ud}|\), \(f_+(0)\) and \( {f_{K^\pm }}/{f_{\pi ^\pm }}\), which are obtained by analysing the lattice data within the Standard Model. For comparison, the lower half lists the values that follow if the lattice results are replaced by the experimental results on nuclear \(\beta \) decay and \(\tau \) decay, respectively

4.6 Direct determination of \(f_K\) and \(f_\pi \)

It is useful for flavour physics to provide not only the lattice average of \(f_K / f_\pi \), but also the average of the decay constant \(f_K\). Indeed, the \(\Delta S = 2\) hadronic matrix element for neutral kaon mixing is generally parameterised by \(M_K\), \(f_K\) and the kaon bag parameter \(B_K\). The knowledge of both \(f_K\) and \(B_K\) is therefore crucial for a precise theoretical determination of the CP-violation parameter \(\epsilon _K\) and for the constraint on the apex of the CKM unitarity triangle.

The case of the decay constant \(f_\pi \) is somewhat different, since the experimental value of this quantity is often used for setting the scale in lattice QCD (see Appendix A.2). However, the physical scale can be set in different ways, namely by using as input the mass of the \(\Omega \)-baryon (\(m_\Omega \)) or the \(\Upsilon \)-meson spectrum (\(\Delta M_\Upsilon \)), which are less sensitive to the uncertainties of the chiral extrapolation in the light-quark mass than \(f_\pi \). In such cases the value of the decay constant \(f_\pi \) becomes a direct prediction of the lattice QCD simulations. It is therefore interesting to provide also the average of the decay constant \(f_\pi \), obtained when the physical scale is set through another hadron observable, in order to check the consistency of different scale-setting procedures.

Our compilation of the values of \(f_\pi \) and \(f_K\) with the corresponding colour code is presented in Table 12. With respect to the case of \(f_K / f_\pi \) we have added two columns indicating which quantity is used to set the physical scale and the possible use of a renormalisation constant for the axial current. Indeed, for several lattice formulations the use of the non-singlet axial-vector Ward identity allows one to avoid the use of any renormalisation constant.

Table 12 Colour code for the lattice data on \(f_\pi \) and \(f_K\) together with information on the way the lattice spacing was converted to physical units and on whether or not an isospin-breaking correction has been applied (using chiral perturbation theory) to the quoted result. The numerical values are listed in MeV units

One can see that the determinations of \(f_\pi \) and \(f_K\) suffer from larger uncertainties than those of the ratio \(f_K / f_\pi \), which is less sensitive to various systematic effects (including the uncertainty of a possible renormalisation constant) and, moreover, is not so exposed to the uncertainties of the procedure used to set the physical scale.

According to the FLAG rules three data sets can form the average of \(f_\pi \) and \(f_K\) for \(N_\mathrm{f} = 2 + 1\): RBC/UKQCD 12 [25] (update of RBC/UKQCD 10A), HPQCD/UKQCD 07 [165] and MILC 10 [159], which is the latest update of the MILC program (footnote 13). We consider HPQCD/UKQCD 07 and MILC 10 as statistically correlated and use the prescription of Sect. 2.3 to form an average. For \(N_\mathrm{f} = 2\) the average cannot be formed for \(f_\pi \), and only one data set (ETM 09) satisfies the FLAG rules in the case of \(f_K\). Following the discussion around the \(N_\mathrm{f}=2+1+1\) result for \(f_{K^\pm }/f_{\pi ^\pm }\) we refrain from providing a FLAG-average for \(f_K\) for this case.

Thus, our estimates (in the isospin-symmetric limit of QCD) read

$$\begin{aligned} f_\pi&= 130.2 ~ (1.4) ~ \hbox {MeV} \qquad \qquad (N_\mathrm{f} = 2 + 1),\end{aligned}$$
(43)
$$\begin{aligned} f_K&= 156.3 ~ (0.9) ~ \hbox {MeV} \qquad \qquad (N_\mathrm{f} = 2 + 1), \nonumber \\ f_K&= 158.1 ~ (2.5) ~ \hbox {MeV} \qquad \qquad (N_\mathrm{f} = 2). \end{aligned}$$
(44)

The lattice results of Table 12 and our estimates (43) and (44) are reported in Fig. 7. The latter agree within errors with the latest experimental determinations of \(f_\pi \) and \(f_K\) from the PDG:

$$\begin{aligned} f_\pi ^\mathrm{(PDG)}&= 130.41 ~ (0.20) ~ \hbox {MeV},\nonumber \\ f_K ^\mathrm{(PDG)}&= 156.1 ~ (0.8) ~ \hbox {MeV}, \end{aligned}$$
(45)

which, we recall, do not correspond to pure QCD results in the isospin-symmetric limit. Moreover, the values of \(f_\pi \) and \(f_K\) quoted by the PDG are obtained assuming Eq. (32) for the value of \(|V_{ud}|\) and adopting the RBC-UKQCD 07 result for \(f_+(0)\).
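The level of agreement between the estimates (43), (44) and the PDG values (45) can be quantified with a simple pull, i.e. the difference divided by the quadrature sum of the two errors. The sketch below is only illustrative: it assumes that the lattice and PDG uncertainties are uncorrelated, which is not strictly true since the PDG value of \(f_K\) relies on a lattice input for \(f_+(0)\). The function name `pull` is our own.

```python
import math

def pull(lattice, lattice_err, reference, reference_err):
    """Difference in units of the combined uncertainty (errors added in quadrature).

    Assumption: the two errors are treated as uncorrelated.
    """
    combined = math.hypot(lattice_err, reference_err)
    return (lattice - reference) / combined

# Central values and errors in MeV, copied from Eqs. (43)-(45).
pull_f_pi = pull(130.2, 1.4, 130.41, 0.20)
pull_f_k = pull(156.3, 0.9, 156.1, 0.8)

print(f"f_pi pull: {pull_f_pi:+.2f} sigma")
print(f"f_K  pull: {pull_f_k:+.2f} sigma")
```

Both pulls come out well below one standard deviation, in line with the statement that the estimates agree with the PDG values within errors.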

Fig. 7 Values of \(f_\pi \) and \(f_K\). The black squares and grey bands indicate our estimates (43) and (44). The blue dots represent the experimental values quoted by the PDG (45)

5 Low-energy constants

In the study of the quark-mass dependence of QCD observables calculated on the lattice it is common practice to invoke Chiral Perturbation Theory (\(\chi \)PT). For a given quantity this framework predicts the non-analytic quark-mass dependence and it provides symmetry relations among different observables. These relations are best expressed with the help of a set of linearly independent and universal (i.e. process-independent) low-energy constants (LECs), which appear as coefficients of the polynomial terms (in \(m_{q}\) or \(M_\pi ^2\)) in different observables. If one expands around the SU(2) chiral limit, in the Chiral Effective Lagrangian there appear two LECs at order \(p^2\)

$$\begin{aligned}&F\equiv F_\pi \,\Big |_{m_{u},m_{d}\rightarrow 0}, \qquad B\equiv \frac{\Sigma }{F^2}\quad \hbox {where}\nonumber \\&\quad \Sigma \equiv -{\langle }\bar{u}u{\rangle }\,\Big |_{\;m_{u},m_{d}\rightarrow 0}, \end{aligned}$$
(46)

and seven at order \(p^4\), indicated by \(\bar{\ell }_i\) with \(i=1,\ldots ,7\). In the analysis of the SU(3) chiral limit there are also just two LECs at order \(p^2\)

$$\begin{aligned}&F_0\equiv F_\pi \,\Big |_{m_{u},m_{d},m_{s}\rightarrow 0} , \qquad B_0\equiv \frac{\Sigma _0}{F_0^2}\quad \nonumber \\&\quad \hbox {where}\quad \Sigma _0\equiv -{\langle }\bar{u}u{\rangle }\,\Big |_{m_{u},m_{d},m_{s}\rightarrow 0}, \end{aligned}$$
(47)

but ten at order \(p^4\), indicated by the capital letter \(L_i(\mu )\) with \(i=1,\ldots ,10\). These constants are independent of the quark massesFootnote 14, but they become scale dependent after renormalisation (sometimes a superscript \(r\) is added). The SU(2) constants \(\bar{\ell }_i\) are scale independent, since they are defined at \(\mu =M_\pi \) (as indicated by the bar). For the precise definition of these constants and their scale dependence we refer the reader to [56, 58].

First of all, lattice calculations can be used to test if chiral symmetry is indeed broken as SU\((N_\mathrm{f})_L \times \)SU\((N_\mathrm{f})_R \rightarrow \)SU\((N_\mathrm{f})_{L+R}\) by measuring non-zero chiral condensates and by verifying the validity of the GMOR relation \(M_\pi ^2\propto m\) close to the chiral limit. If the chiral extrapolation of quantities calculated on the lattice is made with the help of \(\chi \)PT, apart from determining the observable at the physical value of the quark masses one also obtains the relevant LECs. This is a very important by-product for two reasons:

  1. All LECs up to order \(p^4\) (with the exception of \(B\) and \(B_0\), since only the product of these times the quark masses can be estimated from phenomenology) have either been determined by comparison to experiment or estimated theoretically. A lattice determination of the better known ones thus provides a test of the \(\chi \)PT approach.

  2. The less well-known LECs are those which describe the quark-mass dependence of observables; these cannot be determined from experiment, and therefore the lattice provides unique quantitative information. This information is essential for improving phenomenological \(\chi \)PT predictions in which these LECs play a role.

We stress that this program is based on the non-obvious assumption that \(\chi \)PT is valid in the region of masses used in the lattice simulations under consideration.

The fact that, at large volume, the finite-size effects, which occur if a system undergoes spontaneous symmetry breakdown, are controlled by the Nambu–Goldstone modes, was first noted in solid-state physics, in connection with magnetic systems [187, 188]. As pointed out in [189] in the context of QCD, the thermal properties of such systems can be studied in a systematic and model-independent manner by means of the corresponding effective field theory, provided the temperature is low enough. While finite volumes are not of physical interest in particle physics, lattice simulations are necessarily carried out in a finite box. As shown in [190–192], the ensuing finite-size effects can also be studied on the basis of the effective theory (\(\chi \)PT in the case of QCD), provided the simulation is close enough to the continuum limit, the volume is sufficiently large and the explicit breaking of chiral symmetry generated by the quark masses is sufficiently small. \(\chi \)PT is thus also a useful tool for the analysis of the finite-size effects in lattice simulations.

In the following two subsections we summarise the lattice results for the SU(2) and SU(3) LECs, respectively. In either case we first discuss the \(O(p^2)\) constants and then proceed to their \(O(p^4)\) counterparts. The \(O(p^2)\) LECs are determined from the chiral extrapolation of masses and decay constants or, alternatively, from a finite-size study of correlators in the \(\epsilon \)-regime. At order \(p^4\) some LECs affect two-point functions while others appear only in three- or four-point functions; the latter need to be determined from form factors or scattering amplitudes. The \(\chi \)PT analysis of the (non-lattice) phenomenological quantities is nowadaysFootnote 15 based on \(O(p^6)\) formulae. At this level the number of LECs explodes and we will not discuss any of these. We will, however, discuss how comparing different orders and different expansions (in particular \(x\) versus \(\xi \)-expansion; see below) can help to assess the theoretical uncertainties of the LECs determined on the lattice.

5.1 SU(2) low-energy constants

5.1.1 Quark-mass dependence of pseudoscalar masses and decay constants

The expansionsFootnote 16 of \(M_\pi ^2\) and \(F_\pi \) in powers of the quark mass are known to next-to-next-to-leading order in the SU(2) chiral effective theory. In the isospin limit, \(m_{u}=m_{d}=m\), the explicit expressions may be written in the form [193]

$$\begin{aligned} M_\pi ^2&= M^2 \left\{ 1-\frac{1}{2}x\ln \frac{\Lambda _3^2}{M^2} +\frac{17}{8}x^2 \left( \ln \frac{\Lambda _M^2}{M^2} \right) ^2 \right. \nonumber \\&\left. +x^2 k_M +O(x^3) \right\} ,\nonumber \\ F_\pi&= F \left\{ 1+x\ln \frac{\Lambda _4^2}{M^2} -\frac{5}{4}x^2 \left( \ln \frac{\Lambda _F^2}{M^2} \right) ^2 \right. \nonumber \\&\left. +x^2k_F +O(x^3) \right\} . \end{aligned}$$
(48)

Here the expansion parameter is given by

$$\begin{aligned} x=\frac{M^2}{(4\pi F)^2},\quad M^2=2Bm=\frac{2\Sigma m}{F^2}, \end{aligned}$$
(49)

but there is another option as discussed below. The scales \(\Lambda _3,\Lambda _4\) are related to the effective coupling constants \(\bar{\ell }_3,\bar{\ell }_4\) of the chiral Lagrangian at running scale \(M_\pi \equiv M_\pi ^\mathrm{phys}\) by

$$\begin{aligned} \bar{\ell }_n=\ln \frac{\Lambda _n^2}{M_\pi ^2},\quad n=1,\ldots ,7. \end{aligned}$$
(50)

Note that in Eq. (48) the logarithms are evaluated at \(M^2\), not at \(M_\pi ^2\). The coupling constants \(k_M,k_F\) in Eq. (48) are mass-independent. The scales of the squared logarithms can be expressed in terms of the \(O(p^4)\) coupling constants as

$$\begin{aligned} \ln \frac{\Lambda _M^2}{M^2}&= \frac{1}{51}\left( 28\ln \frac{\Lambda _1^2}{M^2} +32\ln \frac{\Lambda _2^2}{M^2} -9 \ln \frac{\Lambda _3^2}{M^2}+49\right) ,\nonumber \\ \ln \frac{\Lambda _F^2}{M^2}&= \frac{1}{30} \left( 14\ln \frac{\Lambda _1^2}{M^2} +16\ln \frac{\Lambda _2^2}{M^2}+6 \ln \frac{\Lambda _3^2}{M^2} \right. \nonumber \\&\left. - 6 \ln \frac{\Lambda _4^2}{M^2}+23 \right) . \end{aligned}$$
(51)

Hence by analysing the quark-mass dependence of \(M_\pi ^2\) and \(F_\pi \) with Eq. (48), possibly truncated at NLO, one can determineFootnote 17 the \(O(p^2)\) LECs \(B\) and \(F\), as well as the \(O(p^4)\) LECs \(\bar{\ell }_3\) and \(\bar{\ell }_4\). The quark condensate in the chiral limit is given by \(\Sigma =F^2B\). With precise enough data at several small enough pion masses, one could in principle also determine \(\Lambda _M\), \(\Lambda _F\) and \(k_M\), \(k_F\). To date this is not yet possible. The results for the LO and NLO constants will be presented in Sect. 5.1.6.
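To make the NLO truncation of Eq. (48) concrete, the sketch below evaluates \(M_\pi ^2\) and \(F_\pi \) as functions of \(M^2=2Bm\) for given \(F\), \(\Lambda _3\) and \(\Lambda _4\). It is an illustration under stated assumptions and not the fit procedure of any collaboration: the function names are ours and the numerical inputs are placeholders chosen only to have a plausible order of magnitude.

```python
import math

def x_parameter(M2, F):
    """Expansion parameter of Eq. (49): x = M^2 / (4 pi F)^2."""
    return M2 / (4.0 * math.pi * F) ** 2

def mpi2_nlo(M2, F, lambda3):
    """M_pi^2 from Eq. (48), truncated after the term linear in x."""
    x = x_parameter(M2, F)
    return M2 * (1.0 - 0.5 * x * math.log(lambda3 ** 2 / M2))

def fpi_nlo(M2, F, lambda4):
    """F_pi from Eq. (48), truncated after the term linear in x."""
    x = x_parameter(M2, F)
    return F * (1.0 + x * math.log(lambda4 ** 2 / M2))

# Placeholder inputs in MeV (assumptions for illustration, not results of this review).
F, LAMBDA3, LAMBDA4 = 86.0, 600.0, 1200.0
for M in (135.0, 250.0, 400.0):
    M2 = M ** 2
    print(M, math.sqrt(mpi2_nlo(M2, F, LAMBDA3)), fpi_nlo(M2, F, LAMBDA4))
```

In a fit, \(F\), \(B\), \(\Lambda _3\) and \(\Lambda _4\) would be the free parameters, with \(M^2=2Bm\) computed from the simulated quark mass.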

Alternatively, one can invert Eq. (48) and express \(M^2\) and \(F\) as an expansion in

$$\begin{aligned} \xi \equiv \frac{M_\pi ^2}{16 \pi ^2 F_\pi ^2}, \end{aligned}$$
(52)

and the corresponding expressions then take the form

$$\begin{aligned}&M^2= M_\pi ^2\nonumber \\&\quad \times \left\{ 1\!+\!\frac{1}{2}\,\xi \,\ln \frac{\Lambda _3^2}{M_\pi ^2}-\frac{5}{8}\,\xi ^2 \left( \!\ln \frac{\Omega _M^2}{M_\pi ^2}\!\right) ^2\!+\! \xi ^2 c_{\scriptscriptstyle M}+O(\xi ^3)\right\} ,\nonumber \\&F= F_\pi \nonumber \\&\quad \times \left\{ 1-\xi \,\ln \frac{\Lambda _4^2}{M_\pi ^2}-\frac{1}{4}\,\xi ^2\left( \!\ln \frac{\Omega _F^2}{M_\pi ^2}\!\right) ^2 +\xi ^2 c_{\scriptscriptstyle F}+O(\xi ^3)\right\} .\nonumber \\ \end{aligned}$$
(53)

The scales of the quadratic logarithms are determined by \(\Lambda _1,\ldots ,\Lambda _4\) through

$$\begin{aligned}&\ln \frac{\Omega _M^2}{M_\pi ^2}=\frac{1}{15}\nonumber \\&\quad \times \left( 28\,\ln \frac{\Lambda _1^2}{M_\pi ^2}\!+\!32\,\ln \frac{\Lambda _2^2}{M_\pi ^2}- 33\,\ln \frac{\Lambda _3^2}{M_\pi ^2}\!-\!12\,\ln \frac{\Lambda _4^2}{M_\pi ^2}+52\right) ,\nonumber \\&\quad \ln \frac{\Omega _F^2}{M_\pi ^2}=\frac{1}{3}\,\left( -7\,\ln \frac{\Lambda _1^2}{M_\pi ^2}-8\,\ln \frac{\Lambda _2^2}{M_\pi ^2}\!+\! 18\,\ln \frac{\Lambda _4^2}{M_\pi ^2}\!-\! \frac{29}{2}\right) .\nonumber \\ \end{aligned}$$
(54)

5.1.2 Two-point correlation functions in the epsilon-regime

The finite-size effects encountered in lattice calculations can be used to determine some of the LECs of QCD. In order to illustrate this point, we focus on the two lightest quarks, take the isospin limit \(m_{u}=m_{d}=m\) and consider a box of size \(L_{s}\) in the three space directions and size \(L_{t}\) in the time direction. If \(m\) is sent to zero at fixed box size, chiral symmetry is restored. The behaviour of the various observables in the symmetry-restoration region is controlled by the parameter \(\mu \equiv m\,\Sigma \,V\), where \(V=L_{s}^3L_{t}\) is the four-dimensional volume of the box. Up to a sign and a factor of two, the parameter \(\mu \) represents the minimum of the classical action that belongs to the leading-order effective Lagrangian of QCD.

For \(\mu \gg 1\), the system behaves qualitatively as if the box were infinitely large. In that region, the \(p\)-expansion, which counts \(1/L_{s}\), \(1/L_{t}\) and \(M\) as quantities of the same order, is adequate. In view of \(\mu =\frac{1}{2}F^2 M^2V \), this region includes configurations with \(ML\gtrsim \! 1\), where the finite-size effects due to pion loop diagrams are suppressed by the factor \(e^{-ML}\).
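The relation between the two forms of \(\mu \) used above follows directly from Eq. (49). As a short worked step, with the second line restricted to the special case of a symmetric box \(L_{s}=L_{t}=L\) (an assumption made here only to simplify the expression):

```latex
\begin{aligned}
M^2 = \frac{2\Sigma m}{F^2}
  \;&\Longrightarrow\; m\,\Sigma = \tfrac{1}{2}F^2M^2
  \;\Longrightarrow\; \mu = m\,\Sigma\,V = \tfrac{1}{2}F^2M^2V , \\
L_{s}=L_{t}=L:\quad \mu &= \tfrac{1}{2}\,(FL)^2\,(ML)^2 .
\end{aligned}
```

In this form \(\mu \) is seen to be controlled by the two dimensionless products \(FL\) and \(ML\).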

If \(\mu \) is comparable to or smaller than 1, however, the chiral perturbation series must be reordered. The \(\epsilon \)-expansion achieves this by counting \(1/L_{s}, 1/L_{t}\) as quantities of \(O(\epsilon )\), while the quark mass \(m\) is booked as a term of \(O(\epsilon ^4)\). This ensures that the symmetry-restoration parameter \(\mu \) represents a term of order \(O(\epsilon ^0)\), so that the manner in which chiral symmetry is restored can be worked out.
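Whether a given simulation point sits closer to the \(p\)-regime or to the \(\epsilon \)-regime can be gauged by evaluating \(\mu \) numerically. The sketch below does this in two equivalent ways; the unit conventions (masses in MeV, lengths in fm, conversion with \(\hbar c\)), the function names and the example inputs are assumptions made for illustration.

```python
import math

HBARC_MEV_FM = 197.327  # hbar*c in MeV fm, used to convert fm to MeV^-1

def volume_mev(ls_fm, lt_fm):
    """Four-volume V = L_s^3 * L_t expressed in MeV^-4."""
    return ls_fm ** 3 * lt_fm / HBARC_MEV_FM ** 4

def mu_parameter(m_mev, sigma_cuberoot_mev, ls_fm, lt_fm):
    """mu = m * Sigma * V, with Sigma given through its cube root in MeV."""
    return m_mev * sigma_cuberoot_mev ** 3 * volume_mev(ls_fm, lt_fm)

def mu_from_meson(F_mev, M_mev, ls_fm, lt_fm):
    """Equivalent form mu = F^2 M^2 V / 2, using M^2 = 2 Sigma m / F^2."""
    return 0.5 * F_mev ** 2 * M_mev ** 2 * volume_mev(ls_fm, lt_fm)

# Placeholder point: m = 3 MeV, Sigma^(1/3) = 250 MeV, 2 fm box in all directions.
mu = mu_parameter(3.0, 250.0, 2.0, 2.0)
print(f"mu = {mu:.2f}")
```

For these placeholder inputs \(\mu \) comes out below 1, i.e. in the region where the text states that the chiral series must be reordered.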

As an example, we consider the correlator of the axial charge carried by the two lightest quarks, \(q(x)\!=\!\{u(x),d(x)\}\). The axial current and the pseudoscalar density are given by

$$\begin{aligned} A_\mu ^i(x)= \bar{q}(x) \frac{1}{2}\tau ^i\,\gamma _\mu \gamma _5\,q(x),\ P^i(x) = \bar{q}(x) \frac{1}{2} \tau ^i\,{i} \gamma _5\,q(x),\nonumber \\ \end{aligned}$$
(55)

where \(\tau ^1, \tau ^2,\tau ^3\), are the Pauli matrices in flavour space. In Euclidean space, the correlators of the axial charge and of the space integral over the pseudoscalar density are given by

$$\begin{aligned} \delta ^{ik}C_{AA}(t) = L_{s}^3\int \hbox {d}^3 \vec {x}\;{\langle }A_4^i(\vec {x},t) A_4^k(0)\rangle , \nonumber \\ \delta ^{ik}C_{PP}(t) = L_{s}^3\int \hbox {d}^3 \vec {x}\;{\langle }P^i(\vec {x},t) P^k(0)\rangle . \end{aligned}$$
(56)

\(\chi \)PT yields explicit finite-size scaling formulae for these quantities [192, 194, 195]. In the \(\epsilon \)-regime, the expansion starts with

$$\begin{aligned} C_{AA}(t) = \frac{F^2L_{s}^3}{L_{t}}\left[ a_A+ \frac{L_{t}}{F^2L_{s}^3}\,b_A\,h_1 \left( \frac{t}{L_{t}} \right) +O(\epsilon ^4)\right] , \nonumber \\ C_{PP}(t) = \Sigma ^2L_{s}^6\left[ a_P+\frac{L_{t}}{F^2L_{s}^3}\,b_P\,h_1 \left( \frac{t}{L_{t}} \right) +O(\epsilon ^4)\right] ,\nonumber \\ \end{aligned}$$
(57)

where the coefficients \(a_A\), \(b_A\), \(a_P\), \(b_P\) stand for quantities of \(O(\epsilon ^0)\). They can be expressed in terms of the variables \(L_{s}\), \(L_{t}\) and \(m\) and involve only the two leading low-energy constants \(F\) and \(\Sigma \). In fact, at leading order only the combination \(\mu =m\,\Sigma \,L_{s}^3 L_{t}\) matters, the correlators are \(t\)-independent and the dependence on \(\mu \) is fully determined by the structure of the groups involved in the SSB pattern. In the case of SU(2) \(\times \) SU(2) \(\rightarrow \) SU(2), relevant for QCD in the symmetry-restoration region with two light quarks, the coefficients can be expressed in terms of Bessel functions. The \(t\)-dependence of the correlators starts showing up at \(O(\epsilon ^2)\), in the form of a parabola, viz. \(h_1(\tau )=\frac{1}{2}[(\tau -\frac{1}{2})^2-\frac{1}{12}]\). Explicit expressions for \(a_A\), \(b_A\), \(a_P\), \(b_P\) can be found in [192, 194, 195], where some of the correlation functions are worked out to NNLO. By matching the finite-size scaling of correlators computed on the lattice with these predictions one can extract \(F\) and \(\Sigma \). A way to deal with the numerical challenges specific to the \(\epsilon \)-regime has been described in [196].
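The parabola \(h_1(\tau )\) that carries the \(t\)-dependence in Eq. (57) is simple enough to check numerically. The short sketch below evaluates it and verifies two properties that follow from the formula quoted above: it is symmetric under \(\tau \rightarrow 1-\tau \), and it averages to zero over \(0\le \tau \le 1\). The midpoint-rule helper is our own illustrative addition.

```python
def h1(tau):
    """Parabola h_1(tau) = ((tau - 1/2)^2 - 1/12) / 2 entering Eq. (57)."""
    return 0.5 * ((tau - 0.5) ** 2 - 1.0 / 12.0)

def average_over_period(func, n=1000):
    """Midpoint-rule estimate of the integral of func over [0, 1]."""
    return sum(func((k + 0.5) / n) for k in range(n)) / n

print(h1(0.0), h1(0.5), h1(1.0))      # end points and minimum of the parabola
print(average_over_period(h1))        # vanishes up to discretisation error
```

The vanishing average means that the \(h_1\) term changes only the shape of the correlator in \(t\), not its mean value over the time extent.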

The fact that the representation of the correlators to NLO is not “contaminated” by higher-order unknown LECs makes the \(\epsilon \)-regime potentially convenient for a clean extraction of the LO couplings. The determination of these LECs is then affected by different systematic uncertainties with respect to the standard case; simulations in this regime yield complementary information which can serve as a valuable cross-check to get a comprehensive picture of the low-energy properties of QCD.

The effective theory can also be used to study the distribution of the topological charge in QCD [197] and the various quantities of interest may be defined for a fixed value of this charge. The expectation values and correlation functions then not only depend on the symmetry-restoration parameter \(\mu \), but also on the topological charge \(\nu \). The dependence on these two variables can explicitly be calculated. It turns out that the two-point correlation functions considered above retain the form (57), but the coefficients \(a_A\), \(b_A\), \(a_P\), \(b_P\) now depend on the topological charge as well as on the symmetry restoration parameter (see [198–200] for explicit expressions).

A specific issue with \(\epsilon \)-regime calculations is the scale setting. Ideally one would perform a \(p\)-regime study with the same bare parameters to measure a hadronic scale (e.g. the proton mass). In the literature, sometimes a gluonic scale (e.g. \(r_0\)) is used to avoid such expenses. Obviously the issues inherent in scale setting are aggravated if the \(\epsilon \)-regime simulation is restricted to a fixed sector of topological charge.

It is important to stress that in the \(\epsilon \)-expansion higher-order finite-volume corrections might be significant, and the physical box size (in fm) should still be large in order to keep these contributions under control. The criteria for the chiral extrapolation and finite-volume effects are obviously different from the \(p\)-regime. For these reasons we have to adjust the colour coding defined in Sect. 2.1 (see Sect. 5.1.6 for more details).

Recently, the effective theory has been extended to the “mixed regime” where some quarks are in the \(p\)-regime and some in the \(\epsilon \)-regime [201, 202]. In [203] a technique is proposed to smoothly connect the \(p\)- and \(\epsilon \)-regimes. In [204] the issue is reconsidered with a counting rule which is essentially the same as in the \(p\)-regime. In this new scheme, the theory remains IR finite even in the chiral limit, while the chiral-logarithmic effects are kept present.

5.1.3 Energy levels of the QCD Hamiltonian in a box and \(\delta \)-regime

At low temperature, the properties of the partition function are governed by the lowest eigenvalues of the Hamiltonian. In the case of QCD, the lowest levels are due to the Nambu–Goldstone bosons and can be worked out with \(\chi \)PT [205]. In the chiral limit the level pattern follows the one of a quantum-mechanical rotator, i.e. \(E_\ell =\ell (\ell +1)/(2\,\Theta )\) with \(\ell = 0, 1,2,\ldots \). For a cubic spatial box and to leading order in the expansion in inverse powers of the box size \(L_{s}\), the moment of inertia is fixed by the value of the pion decay constant in the chiral limit, i.e. \(\Theta =F^2L_{s}^3\).
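The leading-order rotator spectrum can be evaluated directly once \(F\) and \(L_{s}\) are specified. The sketch below is a minimal illustration with the leading-order moment of inertia \(\Theta =F^2L_{s}^3\) only; the unit conventions (MeV and fm, converted with \(\hbar c\)), the function names and the example inputs are assumptions.

```python
HBARC_MEV_FM = 197.327  # hbar*c in MeV fm

def moment_of_inertia(F_mev, ls_fm):
    """Leading-order Theta = F^2 * L_s^3, returned in MeV^-1."""
    return F_mev ** 2 * (ls_fm / HBARC_MEV_FM) ** 3

def rotator_level(ell, F_mev, ls_fm):
    """E_ell = ell (ell + 1) / (2 Theta) in MeV, valid in the chiral limit."""
    return ell * (ell + 1) / (2.0 * moment_of_inertia(F_mev, ls_fm))

# Placeholder inputs: F = 86 MeV in a box of spatial size 2 fm.
for ell in range(4):
    print(ell, rotator_level(ell, 86.0, 2.0))
```

The gap to the first excited level is \(1/\Theta \) and thus falls off as \(L_{s}^{-3}\), which is why a measurement of the spectrum at known box size gives access to \(F\).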

In order to analyse the dependence of the levels on the quark masses and on the parameters that specify the size of the box, a reordering of the chiral series is required, the so-called \(\delta \)-expansion; the region where the properties of the system are controlled by this expansion is referred to as the \(\delta \)-regime. Evaluating the chiral perturbation series in this regime, one finds that the expansion of the partition function goes in even inverse powers of \(FL_{s}\), that the rotator formula for the energy levels holds up to NNLO and the expression for the moment of inertia is now also known up to and including terms of order \((FL_{s})^{-4}\) [206–208]. Since the level spectrum is governed by the value of the pion decay constant in the chiral limit, an evaluation of this spectrum on the lattice can be used to measure \(F\). More generally, the evaluation of various observables in the \(\delta \)-regime offers an alternative method for a determination of some of the low-energy constants occurring in the effective Lagrangian. At present, however, the numerical results obtained in this way [209, 210] are not yet competitive with those found in the \(p\)- or \(\epsilon \)-regimes.

5.1.4 Other methods for the extraction of the low-energy constants

An observable that can be used to extract the LECs is the topological susceptibility

$$\begin{aligned} \chi _{t}=\int \hbox {d}^4\!x\; {\langle }\omega (x) \omega (0)\rangle , \end{aligned}$$
(58)

where \(\omega (x)\) is the topological charge density,

$$\begin{aligned} \omega (x)=\frac{1}{32\pi ^2} \epsilon ^{\mu \nu \rho \sigma }\mathrm{Tr}\left[ F_{\mu \nu }(x)F_{\rho \sigma }(x)\right] . \end{aligned}$$
(59)

At infinite volume, the expansion of \(\chi _{t}\) in powers of the quark masses starts with [211]

$$\begin{aligned} \chi _{t}=\overline{m}\,\Sigma \,\{1\!+\!O(m)\}, \quad \overline{m}\equiv \left( \frac{1}{m_{u}}\!+\!\frac{1}{m_{d}}+\frac{1}{m_{s}}+\cdots \right) ^{-1}. \end{aligned}$$
(60)

The condensate \(\Sigma \) can thus be extracted from the properties of the topological susceptibility close to the chiral limit. The behaviour at finite volume, in particular in the region where the symmetry is restored, is discussed in [195]. The dependence on the vacuum angle \(\theta \) and the projection on sectors of fixed \(\nu \) have been studied in [197]. For a discussion of the finite-size effects at NLO, including the dependence on \(\theta \), we refer to [200, 212].
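At leading order, Eq. (60) reduces the extraction of \(\Sigma \) to a division by the reduced mass \(\overline{m}\). The sketch below implements just this leading-order relation and ignores the \(O(m)\) corrections and all finite-volume effects; the function names and the example inputs are hypothetical.

```python
def reduced_mass(quark_masses):
    """m_bar = (1/m_u + 1/m_d + 1/m_s + ...)^(-1) from Eq. (60)."""
    return 1.0 / sum(1.0 / m for m in quark_masses)

def chi_t_leading(quark_masses, sigma):
    """Leading-order topological susceptibility chi_t = m_bar * Sigma."""
    return reduced_mass(quark_masses) * sigma

def sigma_from_chi_t(chi_t, quark_masses):
    """Invert the leading-order relation to estimate Sigma."""
    return chi_t / reduced_mass(quark_masses)

# Two degenerate light quarks: the reduced mass is m/2.
print(reduced_mass([3.0, 3.0]))
# A much heavier third flavour changes the reduced mass only slightly.
print(reduced_mass([3.0, 3.0, 90.0]))
```

The reduced mass is dominated by the lightest flavours, so the leading-order \(\chi _{t}\) is only weakly sensitive to the strange-quark mass.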

The role that the topological susceptibility plays in attempts to determine whether there is a large paramagnetic suppression when going from the \(N_\mathrm{f}=2\) to the \(N_\mathrm{f}=2+1\) theory has been highlighted in Ref. [213]. The potential usefulness of higher moments of the topological charge distribution to determine LECs has been investigated in [214].

Another method for computing the quark condensate has been proposed in [215], where it is shown that starting from the Banks–Casher relation [216] one may extract the condensate from suitable (renormalisable) spectral observables, for instance the number of Dirac operator modes in a given interval. For those spectral observables higher-order corrections can be systematically computed in terms of the chiral effective theory. A recent paper based on this strategy is ETM 13 [217]. As an aside let us remark that corrections to the Banks–Casher relation that come from a finite quark mass, a finite four-dimensional volume and (with Wilson-type fermions) a finite lattice spacing can be parameterised in a properly extended version of the chiral framework [218].

An alternative strategy is based on the fact that at LO in the \(\epsilon \)-expansion the partition function in a given topological sector \(\nu \) is equivalent to the one of a chiral Random Matrix Theory (RMT) [219–222]. In RMT it is possible to extract the probability distributions of individual eigenvalues [223–225] in terms of two dimensionless variables \(\zeta =\lambda \Sigma V\) and \(\mu =m\Sigma V\), where \(\lambda \) represents the eigenvalue of the massless Dirac operator and \(m\) is the sea quark mass. More recently this approach has been extended to the Hermitian (Wilson) Dirac operator [226] which is easier to study in numerical simulations. Hence, if it is possible to match the QCD low-lying spectrum of the Dirac operator to the RMT predictions, then one may extractFootnote 18 the chiral condensate \(\Sigma \). One issue with this method is that for the distributions of individual eigenvalues higher-order corrections are still not known in the effective theory, and this may introduce systematic effects which are hardFootnote 19 to control. Another open question is that, while it is clear how the spectral density is renormalised [230], this is not the case for the individual eigenvalues, and one relies on assumptions. There have been many lattice studies [231–235] which investigate the matching of the low-lying Dirac spectrum with RMT. In this review the results of the LECs obtained in this wayFootnote 20 are not included.

5.1.5 Pion form factors

The scalar and vector form factors of the pion are defined by the matrix elements

$$\begin{aligned}&{\langle }\pi ^i(p_2) |\, \bar{q}\, q \, | \pi ^j(p_1) {\rangle } = \delta ^{ij} F_S^\pi (t) ,\\&{\langle }\pi ^i(p_2) | \,\bar{q}\, \frac{1}{2} \tau ^k \gamma ^\mu q\,| \pi ^j(p_1) {\rangle } = \hbox {i} \,\epsilon ^{ikj} (p_1^\mu + p_2^\mu ) F_V^\pi (t) ,\nonumber \end{aligned}$$
(61)

where the operators contain only the lightest two quark flavours, i.e. \(\tau ^1\), \(\tau ^2\), \(\tau ^3\) are the Pauli matrices, and \(t\equiv (p_1-p_2)^2\) denotes the momentum transfer.

The vector form factor has been measured by several experiments for timelike as well as for spacelike values of \(t\). The scalar form factor is not directly measurable, but it can be evaluated theoretically from data on the \(\pi \pi \) and \(\pi K\) phase shifts [236] by means of analyticity and unitarity, i.e. in a model-independent way. Lattice calculations can be compared with data or model-independent theoretical evaluations at any given value of \(t\). At present, however, most lattice studies concentrate on the region close to \(t=0\) and on the evaluation of the slope and curvature which are defined as

$$\begin{aligned} F^\pi _V(t)&= 1+{\frac{1}{6}}{\langle }r^2 \rangle ^\pi _V t + c_V t^2+\cdots \;,\\ F^\pi _S(t)&= F^\pi _S(0) \left[ 1+ {\frac{1}{6}}{\langle }r^2 \rangle ^\pi _S t + c_S\, t^2+ \cdots \right] . \nonumber \end{aligned}$$
(62)

The slopes are related to the mean-square vector and scalar radii which are the quantities on which most experiments and lattice calculations concentrate.
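The link between the slope in Eq. (62) and the mean-square radius can be illustrated with a few lines of code. The sketch below evaluates the truncated expansion of \(F_V^\pi (t)\) and recovers the radius from a single point near \(t=0\) when the curvature is neglected. Units (\(t\) in \(\mathrm{GeV}^2\), radius in \(\mathrm{fm}^2\)), function names and input numbers are illustrative assumptions.

```python
HBARC_GEV_FM = 0.197327  # hbar*c in GeV fm

def vector_form_factor(t_gev2, r2_fm2, c_v_gev4=0.0):
    """Eq. (62) truncated at O(t^2): 1 + <r^2> t / 6 + c_V t^2."""
    r2_gev = r2_fm2 / HBARC_GEV_FM ** 2  # convert fm^2 to GeV^-2
    return 1.0 + r2_gev * t_gev2 / 6.0 + c_v_gev4 * t_gev2 ** 2

def radius_from_slope(t_gev2, f_value):
    """Estimate <r^2> in fm^2 from one point, neglecting the curvature term."""
    return 6.0 * (f_value - 1.0) / t_gev2 * HBARC_GEV_FM ** 2

# Spacelike momentum transfer (t < 0) with a placeholder radius of 0.45 fm^2.
f = vector_form_factor(-0.05, 0.45)
print(f, radius_from_slope(-0.05, f))
```

With a non-zero curvature the single-point estimate is biased by a term proportional to \(c_V\,t\), which is one reason why data close to \(t=0\) are preferred.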

In chiral perturbation theory, the form factors are known at NNLO [237]. The corresponding formulae are available in fully analytical form and are compact enough that they can be used for the chiral extrapolation of the data (as done, for example in [238, 239]). The expressions for the scalar and vector radii and for the \(c_{S,V}\) coefficients at two-loop level read

$$\begin{aligned}&{\langle }r^2 \rangle ^\pi _S = \frac{1}{(4\pi F_\pi )^2}\nonumber \\&\times \left\{ 6 \ln \frac{\Lambda _4^2}{M_\pi ^2}-\frac{13}{2} \!-\!\frac{29}{3}\,\xi \left( \ln \frac{\Omega _{r_S}^2}{M_\pi ^2}\right) ^2+ 6 \xi \, k_{r_S}+O(\xi ^2)\right\} ,\nonumber \\&{\langle }r^2 \rangle ^\pi _V = \frac{1}{(4\pi F_\pi )^2}\nonumber \\&\times \left\{ \ln \frac{\Lambda _6^2}{M_\pi ^2}-1 +2\,\xi \left( \ln \frac{\Omega _{r_V}^2}{M_\pi ^2}\right) ^2+6 \xi \,k_{r_V}+O(\xi ^2)\right\} ,\nonumber \\&c_S =\frac{1}{(4\pi F_\pi M_\pi )^2} \left\{ \frac{19}{120} + \xi \left[ \frac{43}{36} \left( \! \ln \frac{\Omega _{c_S}^2}{M_\pi ^2} \!\right) ^2 + k_{c_S} \right] \right\} ,\nonumber \\&c_V =\frac{1}{(4\pi F_\pi M_\pi )^2} \left\{ \frac{1}{60}+\xi \left[ \frac{1}{72} \left( \ln \frac{\Omega _{c_V}^2}{M_\pi ^2} \right) ^2 + k_{c_V} \right] \right\} ,\nonumber \\ \end{aligned}$$
(63)

where

$$\begin{aligned} \ln \frac{\Omega _{r_S}^2}{M_\pi ^2}&= \frac{1}{29}\,\left( 31\,\ln \frac{\Lambda _1^2}{M_\pi ^2}+34\,\ln \frac{\Lambda _2^2}{M_\pi ^2}-36\,\ln \frac{\Lambda _4^2}{M_\pi ^2}+\frac{145}{24}\right) ,\nonumber \\ \ln \frac{\Omega _{r_V}^2}{M_\pi ^2}&= \frac{1}{2}\,\left( \ln \frac{\Lambda _1^2}{M_\pi ^2}-\ln \frac{\Lambda _2^2}{M_\pi ^2}+\ln \frac{\Lambda _4^2}{M_\pi ^2}+\ln \frac{\Lambda _6^2}{M_\pi ^2}-\frac{31}{12}\right) ,\nonumber \\ \ln \frac{\Omega _{c_S}^2}{M_\pi ^2}\!&= \!\frac{43}{63}\,\left( 11\,\ln \frac{\Lambda _1^2}{M_\pi ^2}\!+\!14\,\ln \frac{\Lambda _2^2}{M_\pi ^2}\!+\!18\,\ln \frac{\Lambda _4^2}{M_\pi ^2}\!-\!\frac{6041}{120}\right) ,\nonumber \\ \ln \frac{\Omega _{c_V}^2}{M_\pi ^2}&= \frac{1}{72}\,\left( 2\ln \frac{\Lambda _1^2}{M_\pi ^2}-2\ln \frac{\Lambda _2^2}{M_\pi ^2}-\ln \frac{\Lambda _6^2}{M_\pi ^2}-\frac{26}{30}\right) ,\end{aligned}$$
(64)

and \(k_{r_S},k_{r_V}\) and \(k_{c_S},k_{c_V}\) are independent of the quark masses. Their expression in terms of the \(\ell _i\) and of the \(O(p^6)\) constants \(c_M,c_F\) is known but will not be reproduced here.
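Dropping the terms proportional to \(\xi \) in Eq. (63) and using Eq. (50) at the physical pion mass, the radii are fixed by \(\bar{\ell }_6\) and \(\bar{\ell }_4\) alone. The sketch below evaluates these leading terms in \(\mathrm{fm}^2\). It is an illustration only: the value \(F_\pi \simeq 92.2\) MeV and the inputs for \(\bar{\ell }_6\) and \(\bar{\ell }_4\) are placeholders assumed here, and the neglected \(O(\xi )\) terms are not small in general.

```python
import math

HBARC_MEV_FM = 197.327  # hbar*c in MeV fm

def r2_vector_leading(lbar6, f_pi_mev):
    """First term of Eq. (63): <r^2>_V = (lbar_6 - 1) / (4 pi F_pi)^2, in fm^2."""
    return (lbar6 - 1.0) / (4.0 * math.pi * f_pi_mev) ** 2 * HBARC_MEV_FM ** 2

def r2_scalar_leading(lbar4, f_pi_mev):
    """First term of Eq. (63): <r^2>_S = (6 lbar_4 - 13/2) / (4 pi F_pi)^2, in fm^2."""
    return (6.0 * lbar4 - 6.5) / (4.0 * math.pi * f_pi_mev) ** 2 * HBARC_MEV_FM ** 2

# Placeholder inputs (assumptions, not values advocated by this review).
F_PI_MEV = 92.2
print(r2_vector_leading(16.0, F_PI_MEV))
print(r2_scalar_leading(4.4, F_PI_MEV))
```

Inverting these relations gives a quick, leading-term estimate of \(\bar{\ell }_6\) or \(\bar{\ell }_4\) from a measured radius, to be refined with the full two-loop expressions.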

The difference between the quark-line connected and the full (i.e. containing the connected and the disconnected piece) scalar pion form factor has been investigated by means of Chiral Perturbation Theory in [240]. It is expected that the technique used can be applied to a large class of observables relevant in QCD-phenomenology.

As a point of practical interest let us remark that there are no finite-volume correction formulae for the mean-square radii \({\langle }r^2{\rangle }_{V,S}\) and the curvatures \(c_{V,S}\). The lattice data for \(F_{V,S}(t)\) need to be corrected, point by point in \(t\), for finite-volume effects. In fact, if a given \(t\) is realised through several inequivalent \(p_1-p_2\) combinations, the level of agreement after the correction has been applied is indicative of how well higher-order effects are under control.

5.1.6 Lattice determinations

In this section we summarise the lattice results for the SU(2) couplings in a set of tables (Tables 13, 14, 15, 16) and figures (Figs. 8, 9, 10). The tables present our usual colour coding which summarises the main aspects related to the treatment of the systematic errors of the various calculations.

Table 13 Quark condensate \(\Sigma \equiv |\langle \bar{u}u\rangle |_{m_{u},m_{d}\rightarrow 0}\): colour code and numerical values in MeV (compare Fig. 8)
Table 14 Results for the leading-order SU(2) low-energy constant \(F\) (in MeV) and for the ratio \(F_\pi /F\). Numbers in slanted fonts have been calculated by us (see text for details). Horizontal lines establish the same grouping as in Table 13
Table 15 Results for the SU(2) NLO couplings \(\bar{\ell }_3\) and \(\bar{\ell }_4\). The MILC 10 results are obtained by converting the SU(3) LECs, while the MILC 10A results are obtained with a direct SU(2) fit. For comparison, the last two lines show results from phenomenological analyses
Table 16 Top panel: vector form factor of the pion. Lattice results for the charge radius \({\langle }r^2{\rangle }_V^\pi \) (in \({\mathrm {fm}}^2\)), the curvature \(c_V\) (in \({\mathrm {GeV}}^{-4}\)) and the effective coupling constant \(\bar{\ell }_6\) are compared with the experimental value obtained by NA7 and some phenomenological estimates. Bottom panel: scalar form factor of the pion. Lattice results for the scalar radius \({\langle } r^2 {\rangle }_S^\pi \) (in \({\mathrm {fm}}^2\)) and the combination \(\bar{\ell }_1-\bar{\ell }_2\) are compared with a dispersive calculation of these quantities [193]
Fig. 8 Quark condensate \(\Sigma \equiv |\langle \bar{u}u\rangle |_{m_{u},m_{d}\rightarrow 0}\) (\(\overline{\mathrm{MS}}\)-scheme, scale \(\mu =2\) GeV). Squares and left triangles indicate determinations from correlators in the \(p\)- and \(\epsilon \)-regimes, respectively. Up triangles refer to extractions from the topological susceptibility, diamonds to determinations from the pion form factor, and star symbols refer to the spectral density method. The black squares and grey bands indicate our estimates. The meaning of the colours is explained in Sect. 2

Fig. 9 Comparison of the results for the ratio of the physical pion decay constant \(F_\pi \) and the leading-order SU(2) low-energy constant \(F\). The meaning of the symbols is the same as in Fig. 8

Fig. 10 Effective coupling constants \(\bar{\ell }_3\), \(\bar{\ell }_4\) and \(\bar{\ell }_6\). Squares indicate determinations from correlators in the \(p\)-regime, diamonds refer to determinations from the pion form factor

A delicate issue in the lattice determination of chiral LECs (in particular at NLO) which cannot be reflected by our colour coding is a reliable assessment of the theoretical error that comes from the chiral expansion. We add a few remarks on this point:

  1. Using both the \(x\) and the \(\xi \) expansion is a good way to test how the ambiguity of the chiral expansion (at a given order) affects the numerical values of the LECs that are determined from a particular set of data. For instance, to determine \(\bar{\ell }_4\) (or \(\Lambda _4\)) from lattice data for \(F_\pi \) as a function of the quark mass, one may compare the fits based on the parameterisation \(F_\pi =F\{1+x\ln (\Lambda _4^2/M^2)\}\) [see Eq. (48)] with those obtained from \(F_\pi =F/\{1-\xi \ln (\Lambda _4^2/M_\pi ^2)\}\) [see Eq. (53)]. The difference between the two results provides an estimate of the uncertainty due to the truncation of the chiral series. Which central value one chooses is in principle arbitrary, but we find it advisable to use the one obtained with the \(\xi \) expansion,Footnote 21 in particular because it makes the comparison with phenomenological determinations (where it is standard practice to use the \(\xi \) expansion) more meaningful.

  2. Alternatively one could try to estimate the influence of higher chiral orders by reshuffling irrelevant higher-order terms. For instance, in the example mentioned above one might use \(F_\pi =F/\{1-x\ln (\Lambda _4^2/M^2)\}\) as a different functional form at NLO. Another way to establish such an estimate is to introduce by hand “analytical” higher-order terms (e.g. “analytical NNLO” as done, in the past, by MILC [15]). In principle it would be preferable to include all NNLO terms or none, such that the structure of the chiral expansion is preserved at any order (this is what ETM [241] and JLQCD/TWQCD [67] have done for SU(2) \(\chi \)PT and MILC for SU(3) \(\chi \)PT [37]). There are different opinions in the field as to whether it is advisable to include terms to which the data are not sensitive. If one is willing to include external (typically: non-lattice) information, the use of priors is a theoretically well-founded option (e.g. priors for NNLO LECs if one is interested in LECs at LO/NLO).

  3. Another issue concerns the \(s\)-quark mass dependence of the LECs \(\bar{\ell }_i\) or \(\Lambda _i\) of the SU(2) framework. As far as variations of \(m_{s}\) around \(m_{s}^\mathrm{phys}\) are concerned (say for \(0<m_{s}<1.5m_{s}^\mathrm{phys}\) at best) the issue can be studied in SU(3) ChPT, and this has been done in a series of papers [56, 242, 243]. However, the effect of sending \(m_{s}\) to infinity, as is the case in \(N_\mathrm{f}=2\) lattice studies of SU(2) LECs, cannot be addressed in this way. A unique way to analyse this difference is to compare the numerical values of LECs determined in \(N_\mathrm{f}=2\) lattice simulations to those determined in \(N_\mathrm{f}=2+1\) lattice simulations (see e.g. [244] for a discussion).

  4. Last but not least let us recall that the determination of the LECs is affected by discretisation effects, and it is important that these are removed by means of a continuum extrapolation. In this step invoking an extended version of the chiral Lagrangian [245–247] may be useful (see footnote 22) in case one aims for a global fit of lattice data involving several \(M_\pi \) and \(a\) values and several chiral observables.
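The comparison suggested in the first remark can be illustrated numerically. The sketch below is only an illustration under stated assumptions: it uses the standard definitions \(x=M^2/(4\pi F)^2\), \(\xi =M_\pi ^2/(4\pi F_\pi )^2\) and \(\bar{\ell }_4=\ln (\Lambda _4^2/M_\pi ^2)\) at the physical point, approximates \(M\simeq M_\pi \) inside the \(x\) expansion, and takes \(M_\pi =135\) MeV and \(F_\pi =92.2\) MeV as illustrative inputs. The function names are hypothetical and no fit to lattice data is performed.

```python
import math

def ratio_xi(lbar4, m_pi, f_pi):
    """F_pi/F from the xi expansion, F_pi = F / (1 - xi * lbar4)."""
    xi = m_pi**2 / (4.0 * math.pi * f_pi) ** 2
    return 1.0 / (1.0 - xi * lbar4)

def ratio_x(lbar4, m_pi, f_pi):
    """F_pi/F from the x expansion, F_pi = F * (1 + x * lbar4).

    Since x = xi * (F_pi/F)**2, the ratio r = F_pi/F solves
    a*r**2 - r + 1 = 0 with a = xi * lbar4 (assuming M ~ M_pi).
    The root that tends to 1 for small a is returned.
    """
    xi = m_pi**2 / (4.0 * math.pi * f_pi) ** 2
    a = xi * lbar4
    return (1.0 - math.sqrt(1.0 - 4.0 * a)) / (2.0 * a)

# Illustrative inputs (assumptions of this example, not fit results).
M_PI, F_PI, LBAR4 = 135.0, 92.2, 4.02

r_xi = ratio_xi(LBAR4, M_PI, F_PI)
r_x = ratio_x(LBAR4, M_PI, F_PI)
print(f"xi expansion: F_pi/F = {r_xi:.4f}")
print(f"x  expansion: F_pi/F = {r_x:.4f}")
print(f"difference (truncation estimate): {abs(r_x - r_xi):.4f}")
```

With these inputs the two forms give roughly 1.058 and 1.061, so the spread is a few permille. In an actual analysis the difference would be taken between two fits to the same data rather than between two evaluations at fixed \(\bar{\ell }_4\).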

In the tables and figures we summarise the results of various lattice collaborations for the SU(2) LECs at LO (\(F\) or \(F/F_\pi \), \(B\) or \(\Sigma \)) and at NLO (\(\bar{\ell }_1-\bar{\ell }_2\), \(\bar{\ell }_3\), \(\bar{\ell }_4\), \(\bar{\ell }_5\), \(\bar{\ell }_6\)). Throughout we group the results into those which stem from \(N_\mathrm{f}=2+1+1\) calculations, those which come from \(N_\mathrm{f}=2+1\) calculations and those which stem from \(N_\mathrm{f}=2\) calculations (since, as mentioned above, the LECs are logically distinct even if the current precision of the data is not sufficient to resolve the differences). Furthermore, we make a distinction whether the results are obtained from simulations in the \(p\)-regime or whether alternative methods (\(\epsilon \)-regime, spectral quantities, topological susceptibility, etc.) have been used (this should not affect the result). For comparison we add, in each case, a few phenomenological determinations with high standing.

A generic comment applies to the issue of the scale setting. In the past none of the lattice studies with \(N_\mathrm{f}\ge 2\) involved simulations in the \(p\)-regime at the physical value of \(m_{ud}\). Accordingly, setting the scale \(a^{-1}\) via an experimentally measurable quantity necessarily involved a chiral extrapolation. As a result, dimensionful quantities used to be particularly sensitive to this extrapolation uncertainty, while in dimensionless ratios such as \(F_\pi /F\), \(F/F_0\), \(B/B_0\), \(\Sigma /\Sigma _0\) this particular problem is much reduced (and often finite lattice-to-continuum renormalisation factors drop out). Now, there is a new generation of lattice studies [20, 22, 23, 140, 249, 250] which does involve simulations at physical pion masses. In such studies even the uncertainty that the scale setting induces in dimensionful quantities is much mitigated.

It is worth repeating here that the standard colour-coding scheme of our tables is necessarily schematic and cannot do justice to every calculation. In particular there is some difficulty in coming up with a fair adjustment of the rating criteria to finite-volume regimes of QCD. For instance, in the \(\epsilon \)-regime (see footnote 23) we re-express the “chiral-extrapolation” criterion in terms of \(\sqrt{2m_\mathrm{min}\Sigma }/F\), with the same threshold values (in MeV) between the three categories as in the \(p\)-regime. Also the “infinite-volume” assessment is adapted to the \(\epsilon \)-regime, since the \(M_\pi L\) criterion does not make sense here; we assign a green star if at least two volumes with \(L>2.5\,{\mathrm {fm}}\) are included, an open symbol if at least one volume with \(L>2\,{\mathrm {fm}}\) is invoked and a red square if all boxes are smaller than \(2\,{\mathrm {fm}}\). Similarly, in the calculation of form factors and charge radii the tables do not reflect whether an interpolation to the desired \(q^2\) has been performed or whether the relevant \(q^2\) has been engineered by means of “partially twisted boundary conditions” [253]. In spite of these limitations we feel that these tables give an adequate overview of the qualities of the various calculations.
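The adapted “infinite-volume” rule for the \(\epsilon \)-regime stated above can be written as a small decision function. This is a minimal sketch: the function name and the returned labels are hypothetical, and the treatment of a box with exactly \(L=2\,{\mathrm {fm}}\) (counted as red here) is an assumption, since the text does not specify the boundary case.

```python
def epsilon_regime_volume_rating(box_sizes_fm):
    """Rate the finite-volume quality of an epsilon-regime calculation.

    box_sizes_fm: iterable of spatial box lengths L in fm.
    Returns "green star", "open symbol" or "red square".
    """
    sizes = list(box_sizes_fm)
    # Green star: at least two volumes with L > 2.5 fm.
    if sum(1 for length in sizes if length > 2.5) >= 2:
        return "green star"
    # Open symbol: at least one volume with L > 2 fm.
    if any(length > 2.0 for length in sizes):
        return "open symbol"
    # Red square: no box exceeds 2 fm (L == 2 fm lands here by assumption).
    return "red square"

print(epsilon_regime_volume_rating([2.6, 3.0]))   # two large boxes
print(epsilon_regime_volume_rating([1.8, 2.7]))   # only one box above 2.5 fm
print(epsilon_regime_volume_rating([1.5, 1.9]))   # all boxes below 2 fm
```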

We begin with a discussion of the lattice results for the SU(2) LEC \(\Sigma \). We present the results in Table 13 and Fig. 8. We add that results which include only a statistical error are listed in the table but omitted from the plot. Regarding the \(N_\mathrm{f}=2\) computations there are five entries without a red tag (ETM 08, ETM 09C, ETM 12, ETM 13, Brandt 13). We form the average based on ETM 09C, ETM 13 (here we deviate from our “superseded” rule, since the latter work has a much bigger error) and Brandt 13. Regarding the \(N_\mathrm{f}=2+1\) computations there are three published papers (RBC/UKQCD 10A, MILC 10A and Borsanyi 12) which make it into the \(N_\mathrm{f}=2+1\) average and a preprint (BMW 13) which will be included in a future update. We also remark that among the three works included RBC/UKQCD 10A is inconsistent with the other two (MILC 10A and Borsanyi 12). For the time being we inflate the error of our \(N_\mathrm{f}=2+1\) average such that it includes all three central values it is based on. This yields

$$\begin{aligned} \Sigma \big |_{N_\mathrm{f}=2}=269(08)\,{\mathrm {MeV}},\quad \Sigma \big |_{N_\mathrm{f}=2+1}=271(15)\,{\mathrm {MeV}},\nonumber \\ \end{aligned}$$
(65)

where the errors include both statistical and systematic uncertainties. In accordance with our guidelines we plead with the reader to cite [217, 241, 257] (for \(N_\mathrm{f}=2\)) or [75, 78, 249] (for \(N_\mathrm{f}=2+1\)) when using these numbers. Finally, for \(N_\mathrm{f}=2+1+1\) there is only one calculation, and we recommend using the result of [217] as given in Table 13. Another look at Fig. 8 confirms that these values are consistent with each other.
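The idea of inflating the error of an average until it covers all contributing central values can be sketched as follows. This is a simplified illustration and not the full averaging procedure of this report: it uses a plain inverse-variance weighted mean, ignores correlations between the inputs, and the three input numbers are hypothetical placeholders rather than the entries of Table 13.

```python
import math

def weighted_average(values, errors):
    """Inverse-variance weighted mean and its naive error (no correlations)."""
    weights = [1.0 / e**2 for e in errors]
    total = sum(weights)
    mean = sum(w * v for w, v in zip(weights, values)) / total
    return mean, math.sqrt(1.0 / total)

def inflate_to_cover(mean, err, values):
    """Enlarge err, if needed, so that mean +/- err contains every central value."""
    largest_deviation = max(abs(v - mean) for v in values)
    return max(err, largest_deviation)

# Hypothetical inputs in MeV, chosen only to show a case with internal tension.
values = [250.0, 275.0, 280.0]
errors = [10.0, 10.0, 10.0]

mean, err = weighted_average(values, errors)
inflated = inflate_to_cover(mean, err, values)
print(f"naive average: {mean:.1f} +/- {err:.1f}")
print(f"after inflation: {mean:.1f} +/- {inflated:.1f}")
```

The central value is left untouched; only the quoted error grows, which is why an inflated average can look less precise than some of its inputs.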

The next quantity considered is \(F\), i.e. the pion decay constant in the SU(2) chiral limit (\(m_{ud}\rightarrow 0\) at fixed physical \(m_{s}\)) in the Bernese normalisation. As argued on previous occasions we tend to give preference to \(F_\pi /F\) (here the numerator is meant to refer to the physical-pion-mass point) wherever it is available, since often some of the systematic uncertainties are mitigated. We collect the results in Table 14 and Fig. 9. In those cases where the collaboration provides only \(F\), the ratio is computed on the basis of the phenomenological value of \(F_\pi \), and the corresponding entries in Table 14 are in slanted fonts. Among the \(N_\mathrm{f}=2\) determinations only three (ETM 08, ETM 09C and Brandt 13) are without red tags. Since the first two are by the same collaboration, only the latter two enter the average. Among the \(N_\mathrm{f}=2+1\) determinations three values (MILC 09A as an obvious update of MILC 09, NPLQCD 11 and Borsanyi 12) make it into the average. Finally, there is a single \(N_\mathrm{f}=2+1+1\) determination (ETM 10) which forms the current best estimate in this category.

Given this input our averaging procedure yields

$$\begin{aligned} \frac{F_\pi }{F}\big |_{N_\mathrm{f}=2}=1.0744(67),\quad \frac{F_\pi }{F}\big |_{N_\mathrm{f}=2+1}=1.0624(21), \end{aligned}$$
(66)

where the errors include both statistical and systematic uncertainties. We plead with the reader to cite [241, 257] (for \(N_\mathrm{f}=2\)) or [37, 249, 267] (for \(N_\mathrm{f}=2+1\)) when using these numbers. Finally, for \(N_\mathrm{f}=2+1+1\) we recommend using the result of [98]; see Table 14 for the numerical value. From these numbers (or from a look at Fig. 9) it is obvious that the \(N_\mathrm{f}=2+1\) and \(N_\mathrm{f}=2+1+1\) results are not quite consistent. From a theoretical viewpoint this is rather surprising, since the only difference (the presence or absence of a dynamical charm quark) is expected to have a rather insignificant effect on this ratio (which, in addition, would be monotonic in \(N_\mathrm{f}\), contrary to what is seen in Fig. 9). In our view this indicates that, in spite of the conservative attitude taken in this report, the theoretical uncertainties in at least one of the two cases are likely underestimated. We hope that a future release of the FLAG report can clarify the issue.

We move on to a discussion of the lattice results for the NLO LECs \(\bar{\ell }_3\) and \(\bar{\ell }_4\). We remind the reader that on the lattice the former LEC is obtained as a result of the tiny deviation from linearity seen in \(M_\pi ^2\) versus \(Bm_{ud}\), whereas the latter LEC is extracted from the curvature in \(F_\pi \) versus \(Bm_{ud}\). The available determinations are presented in Table 15 and Fig. 10. Among the \(N_\mathrm{f}=2\) determinations ETM 08, ETM 09C and Brandt 13 are published and without red tags, and our rules imply that the latter two determinations enter our average. The colour coding of the \(N_\mathrm{f}=2+1\) results looks very promising; there is a significant number of lattice determinations without any red tag. At first sight it seems that RBC/UKQCD 10A, MILC 10A, NPLQCD 11, Borsanyi 12 and RBC/UKQCD 12 make it into the average. Unfortunately, \(\bar{\ell }_3\) and \(\bar{\ell }_4\) of RBC/UKQCD 10A have no systematic error; therefore we exclude this work from the \(N_\mathrm{f}=2+1\) average. Among the \(N_\mathrm{f}=2+1+1\) determinations only ETM 10 qualifies for an average.

Given this input our averaging procedure yields

$$\begin{aligned} \bar{\ell }_3\big |_{N_\mathrm{f}=2}=3.41(41),\quad \bar{\ell }_3\big |_{N_\mathrm{f}=2+1}=3.05(99), \end{aligned}$$
(67)
$$\begin{aligned} \bar{\ell }_4\big |_{N_\mathrm{f}=2}=4.62(22),\quad \bar{\ell }_4\big |_{N_\mathrm{f}=2+1}=4.02(28), \end{aligned}$$
(68)

where the errors include both statistical and systematic uncertainties. Again we plead with the reader to cite [241, 257] (for \(N_\mathrm{f}=2\)) or [25, 75, 249, 267] (for \(N_\mathrm{f}=2+1\)) when using these numbers. For \(N_\mathrm{f}=2+1+1\) we stay with the recommendation to use the results of [98], see Table 15 for the numerical values.

Let us add two remarks. On the input side our procedure (see footnote 24) symmetrises the asymmetric error of ETM 09C with a slight adjustment of the central value. On the output side the error of the \(\bar{\ell }_3\) average for \(N_\mathrm{f}=2\) and of the \(\bar{\ell }_3,\bar{\ell }_4\) averages for \(N_\mathrm{f}=2+1\) was, according to the FLAG procedure, inflated by hand to cover all central values. From these numbers (or from a look at Fig. 10) it is clear that the lattice results for \(\bar{\ell }_3\) do not show any obvious \(N_\mathrm{f}\)-dependence, thanks chiefly to our conservative error treatment strategy. On the other hand, in the case of \(\bar{\ell }_4\) even our practice of inflating the error of the \(N_\mathrm{f}=2+1\) average did not manage to avoid some mild inconsistency between the \(N_\mathrm{f}=2+1\) average on one side and either the \(N_\mathrm{f}=2\) or the \(N_\mathrm{f}=2+1+1\) average on the other side. Again, the dependence of the average on the number of active flavours is not monotonic, and this raises the suspicion that some of the systematic errors might still be underestimated.

More specifically, it seems that again the \(N_\mathrm{f}=2+1+1\) value by ETM shows some tension relative to the average \(N_\mathrm{f}=2+1\) value quoted above, in close analogy to what happened for \(F\) or \(F_\pi /F\); see the discussion around (66). Since both \(F\) and \(\bar{\ell }_4\) are determined from the quark-mass dependence of the pseudoscalar decay constant, perhaps the formulae in Refs. [273, 274] for dealing with cutoff and finite-volume effects with twisted-mass data might prove useful in future analyses.
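The degree of tension between the \(N_\mathrm{f}=2\) and \(N_\mathrm{f}=2+1\) averages quoted in (65)–(68) can be quantified with a simple pull, i.e. the difference divided by the errors added in quadrature. The sketch below assumes uncorrelated Gaussian errors, which is a simplification; the \(N_\mathrm{f}=2+1+1\) values are not included because they are only given in the tables. The function name is hypothetical.

```python
import math

def pull(a, sigma_a, b, sigma_b):
    """Difference a - b in units of the combined (uncorrelated) error."""
    return (a - b) / math.hypot(sigma_a, sigma_b)

# (value, error) pairs: first N_f = 2, then N_f = 2+1, as quoted in (65)-(68).
averages = {
    "Sigma [MeV]": ((269.0, 8.0), (271.0, 15.0)),
    "F_pi/F": ((1.0744, 0.0067), (1.0624, 0.0021)),
    "lbar_3": ((3.41, 0.41), (3.05, 0.99)),
    "lbar_4": ((4.62, 0.22), (4.02, 0.28)),
}

pulls = {name: pull(*nf2, *nf21) for name, (nf2, nf21) in averages.items()}
for name, value in pulls.items():
    print(f"{name:12s} pull = {value:+.2f}")
```

Under these assumptions the pulls for \(\Sigma \) and \(\bar{\ell }_3\) stay well below one half in magnitude, whereas those for \(F_\pi /F\) and \(\bar{\ell }_4\) come out near 1.7, in line with the mild inconsistency noted above.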

From a more phenomenological viewpoint there is a notable difference between \(\bar{\ell }_3\) and \(\bar{\ell }_4\) in Fig. 10. For \(\bar{\ell }_4\) the precision of the phenomenological determination achieved in Colangelo 01 [193] represents a significant improvement compared to Gasser 84 [58]. Picking any \(N_\mathrm{f}\), the lattice average of \(\bar{\ell }_4\) is consistent with both of the phenomenological values and comes with an error which is roughly comparable to the uncertainty of the result in Colangelo 01 [193]. By contrast, for \(\bar{\ell }_3\) the error of the lattice determination is significantly smaller than the error of the estimate given in Gasser 84 [58]. In other words, here the lattice really provides some added value.

We finish with a discussion of the lattice results for \(\bar{\ell }_6\) and \(\bar{\ell }_1-\bar{\ell }_2\). The LEC \(\bar{\ell }_6\) determines the leading contribution in the chiral expansion of the pion charge radius—see (63). Hence from a lattice study of the vector form factor of the pion with several \(M_\pi \) one may extract the radius \({\langle }r^2{\rangle }_V^\pi \), the curvature \(c_V\) (both at the physical pion-mass point) and the LEC \(\bar{\ell }_6\) in one go. Similarly, the leading contribution in the chiral expansion of the scalar radius of the pion determines \(\bar{\ell }_4\)—see (63). This LEC is also present in the pion-mass dependence of \(F_\pi \), as we have seen. The difference \(\bar{\ell }_1-\bar{\ell }_2\), finally, may be obtained from the momentum dependence of the vector and scalar pion form factors, based on the two-loop formulae of [237]. The top part of Table 16 collects the results obtained from the vector form factor of the pion (charge radius, curvature and \(\bar{\ell }_6\)). Regarding this low-energy constant two \(N_\mathrm{f}=2\) calculations are published works without a red tag; we thus arrive at the estimate

$$\begin{aligned} \bar{\ell }_6\big |_{N_\mathrm{f}=2}=15.1(1.2) \end{aligned}$$
(69)

which is represented as a grey band in the last panel of Fig. 10. Here we plead with the reader to cite [238, 257] when using this number.

The experimental information concerning the charge radius is excellent and the curvature is also known very accurately, based on \(e^+e^-\) data and dispersion theory. The vector form factor calculations thus present an excellent testing ground for the lattice methodology. The table shows that most of the available lattice results pass the test. There is, however, one worrisome point. For \(\bar{\ell }_6\) the agreement seems less convincing than for the charge radius, even though the two quantities are closely related. So far we have no explanation, but we urge the groups to pay special attention to this point. Similarly, the bottom part of Table 16 collects the results obtained for the scalar form factor of the pion and the combination \(\bar{\ell }_1-\bar{\ell }_2\) that is extracted from it.

Perhaps the most important physics result of this section is that the lattice simulations confirm the approximate validity of the Gell-Mann–Oakes–Renner formula and show that the square of the pion mass indeed grows in proportion to \(m_{ud}\). The formula represents the leading term of the chiral perturbation series and necessarily receives corrections from higher orders. At first non-leading order, the correction is determined by the effective coupling constant \(\bar{\ell }_3\). The results collected in Table 15 and in the top panel of Fig. 10 show that \(\bar{\ell }_3\) is now known quite well. They corroborate the conclusion drawn already in Ref. [278]: the lattice confirms the estimate of \(\bar{\ell }_3\) derived in [58]. In the graph of \(M_\pi ^2\) versus \(m_{ud}\), the values found on the lattice for \(\bar{\ell }_3\) correspond to remarkably little curvature: the Gell-Mann–Oakes–Renner formula represents a reasonable first approximation out to values of \(m_{ud}\) that exceed the physical value by an order of magnitude.
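The size of the NLO correction to the Gell-Mann–Oakes–Renner formula can be estimated with a few lines of code. The sketch assumes the standard NLO expression \(M_\pi ^2=M^2\{1-\tfrac{1}{2}x\ln (\Lambda _3^2/M^2)\}\) with \(x=M^2/(4\pi F)^2\) and \(\bar{\ell }_3=\ln (\Lambda _3^2/M_\pi ^2)\) at the physical point. As illustrative inputs it takes \(\bar{\ell }_3=3.05\) from (67), \(F=F_\pi /1.0624\) from (66) with an assumed \(F_\pi =92.2\) MeV, and identifies \(M\) with \(M_\pi =135\) MeV at the physical quark mass. The function name is hypothetical.

```python
import math

def gmor_nlo_correction(scale, lbar3, m_pi_phys, f):
    """Relative NLO correction to M_pi^2 = M^2 for m_ud = scale * m_ud^phys.

    Implements -0.5 * x * ln(Lambda_3^2 / M^2) with x = M^2 / (4 pi F)^2,
    where M^2 is taken to grow linearly with the quark mass.
    """
    m_sq = scale * m_pi_phys**2           # LO mass squared at the rescaled quark mass
    x = m_sq / (4.0 * math.pi * f) ** 2
    log_term = lbar3 - math.log(scale)    # ln(Lambda_3^2 / M^2)
    return -0.5 * x * log_term

# Illustrative inputs (assumptions of this example).
LBAR3 = 3.05
M_PI = 135.0
F = 92.2 / 1.0624

for scale in (1.0, 5.0, 10.0):
    corr = gmor_nlo_correction(scale, LBAR3, M_PI, F)
    print(f"m_ud = {scale:4.1f} x physical: relative correction = {corr:+.3f}")
```

With these inputs the correction is about \(-2\,\%\) at the physical point and remains below \(6\,\%\) in magnitude at ten times the physical quark mass, which illustrates the small curvature referred to above. Higher-order terms are neglected, so the numbers at large \(m_{ud}\) are indicative only.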

As emphasised by Stern and collaborators [279–281], the analysis in the framework of \(\chi \)PT is coherent only if (i) the leading term in the chiral expansion of \(M_\pi ^2\) dominates over the remainder and (ii) the ratio \(m_{s}/m_{ud}\) is close to the value 25.6 that follows from Weinberg’s leading-order formulae. In order to investigate the possibility that one or both of these conditions might fail, the authors proposed a more general framework, referred to as “Generalised \(\chi \)PT”, which includes \(\chi \)PT as a special case. The results found on the lattice demonstrate that QCD does satisfy both of the above conditions; in the context of QCD, the proposed generalisation of the effective theory does not appear to be needed. There is a modified version, however, referred to as “Resummed \(\chi \)PT” [282], which is motivated by the possibility that the Zweig-rule violating couplings \(L_4\) and \(L_6\) might be larger than expected. The available lattice data do not support this possibility, but they do not rule it out either (see Sect. 5.2.4 for details).

5.2 SU(3) low-energy constants

5.2.1 Quark-mass dependence of pseudoscalar masses and decay constants

In the isospin limit, the relevant SU(3) formulae take the form [56]

$$\begin{aligned}&M_\pi ^2 \mathop {=}\limits ^{\mathrm{NLO}}2B_0m_{ud} \left\{ 1+\mu _\pi -\frac{1}{3}\mu _\eta +\frac{B_0}{F_0^2} \right. \nonumber \\&\quad \times \left. \left[ 16m_{ud}(2L_8-L_5)+16(m_{s}+2m_{ud})(2L_6-L_4)\right] \right\} \; \,\nonumber \\&M_{K}^2 \mathop {=}\limits ^{\mathrm{NLO}}B_0(m_{s}+m_{ud}) \left\{ 1+\frac{2}{3}\mu _\eta +\frac{B_0}{F_0^2}\right. \nonumber \\&\quad \times \left. \left[ 8(m_{s}+m_{ud})(2L_8\!-\!L_5)+16(m_{s}+2m_{ud}) (2L_6\!-\!L_4)\right] \right\} \;\,\quad \nonumber \\&F_\pi \mathop {=}\limits ^{\mathrm{NLO}}F_0 \left\{ 1\!-\!2\mu _\pi \!-\!\mu _K\!+\!\frac{B_0}{F_0^2}\left[ 8m_{ud}L_5\!+\!8(m_{s}\!+\!2m_{ud})L_4\right] \right\} \;\, \\&F_K\mathop {=}\limits ^{\mathrm{NLO}}F_0 \left\{ 1-\frac{3}{4}\mu _\pi -\frac{3}{2}\mu _K-\frac{3}{4}\mu _\eta +\frac{B_0}{F_0^2}\right. \nonumber \\&\quad \times \left. \left[ 4(m_{s}+m_{ud})L_5+8(m_{s}+2m_{ud})L_4\right] \right\} \;\,\nonumber \end{aligned}$$
(70)

where \(m_{ud}\) is the common up and down quark mass (which may be different from the one in the real world), and \(B_0=\Sigma _0/F_0^2\), \(F_0\) denote the condensate parameter and the pseudoscalar decay constant in the SU(3) chiral limit, respectively. In addition, we use the notation

$$\begin{aligned} \mu _P=\frac{M_P^2}{32\pi ^2F_0^2} \ln \!\left( \frac{M_P^2}{\mu ^2}\right) . \end{aligned}$$
(71)

At the order of the chiral expansion used in these formulae, the quantities \(\mu _\pi \), \(\mu _K\), \(\mu _\eta \) can equally well be evaluated with the leading-order expressions for the masses,

$$\begin{aligned}&M_\pi ^2\mathop {=}\limits ^{\mathrm{LO}}2B_0\,m_{ud},\quad M_K^2\mathop {=}\limits ^{\mathrm{LO}}B_0(m_{s} + m_{ud}),\nonumber \\&\quad M_\eta ^2\mathop {=}\limits ^{\mathrm{LO}}{\frac{2}{3}}B_0(2m_{s} + m_{ud}). \end{aligned}$$
(72)

Throughout, \(L_i\) denotes the renormalised low-energy constant/coupling (LEC) at scale \(\mu \), and we adopt the convention which is standard in phenomenology, \(\mu =770\,{\mathrm {MeV}}\). The normalisation used for the decay constants is specified in footnote 16.
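The decay-constant formulae of (70), together with (71) and (72), translate directly into code. The following is a minimal sketch with hypothetical function names; the numerical inputs for \(B_0\), \(F_0\), the quark masses and the LECs are illustrative placeholders and not results from Tables 17 or 18. All dimensionful quantities are in MeV.

```python
import math

MU = 770.0  # renormalisation scale in MeV, as adopted in the text

def lo_masses_sq(b0, m_ud, m_s):
    """Leading-order squared meson masses, Eq. (72)."""
    return {
        "pi": 2.0 * b0 * m_ud,
        "K": b0 * (m_s + m_ud),
        "eta": (2.0 / 3.0) * b0 * (2.0 * m_s + m_ud),
    }

def chiral_log(m_sq, f0, mu=MU):
    """Chiral logarithm mu_P of Eq. (71)."""
    return m_sq / (32.0 * math.pi**2 * f0**2) * math.log(m_sq / mu**2)

def f_pi_nlo(b0, f0, m_ud, m_s, l4, l5):
    """F_pi at NLO, Eq. (70), with logs evaluated at LO masses."""
    m2 = lo_masses_sq(b0, m_ud, m_s)
    logs = -2.0 * chiral_log(m2["pi"], f0) - chiral_log(m2["K"], f0)
    analytic = b0 / f0**2 * (8.0 * m_ud * l5 + 8.0 * (m_s + 2.0 * m_ud) * l4)
    return f0 * (1.0 + logs + analytic)

def f_k_nlo(b0, f0, m_ud, m_s, l4, l5):
    """F_K at NLO, Eq. (70), with logs evaluated at LO masses."""
    m2 = lo_masses_sq(b0, m_ud, m_s)
    logs = (-0.75 * chiral_log(m2["pi"], f0)
            - 1.5 * chiral_log(m2["K"], f0)
            - 0.75 * chiral_log(m2["eta"], f0))
    analytic = b0 / f0**2 * (4.0 * (m_s + m_ud) * l5 + 8.0 * (m_s + 2.0 * m_ud) * l4)
    return f0 * (1.0 + logs + analytic)

# Placeholder inputs for illustration only.
B0, F0, M_UD, M_S = 2500.0, 80.0, 3.5, 95.0
L4, L5 = 0.0, 1.0e-3

print("F_pi (NLO) =", round(f_pi_nlo(B0, F0, M_UD, M_S, L4, L5), 2), "MeV")
print("F_K  (NLO) =", round(f_k_nlo(B0, F0, M_UD, M_S, L4, L5), 2), "MeV")
```

Two structural checks follow from the formulae themselves: the difference \(F_K-F_\pi \) does not depend on \(L_4\), and \(F_K=F_\pi \) when \(m_{s}=m_{ud}\).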

5.2.2 Charge radius

The SU(3) formula for the slope of the pion vector form factor reads [152]

$$\begin{aligned} {\langle }r^2{\rangle }_V^\pi&\mathop {=}\limits ^{\mathrm{LO}} -\frac{1}{32\pi ^2F_0^2} \left\{ 3+2\ln \left( \frac{M_\pi ^2}{\mu ^2}\right) +\ln \left( \frac{M_K^2}{\mu ^2}\right) \right\} \nonumber \\&+\,\frac{12L_9}{F_0^2}, \end{aligned}$$
(73)

while the expression \({\langle }r^2\rangle _S^{\mathrm {oct}}\) for the octet part of the scalar radius does not contain any NLO low-energy constant at the one-loop order [152] (cf. Sect. 5.1.5 for the situation in SU(2)).
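Formula (73) can be evaluated numerically once \(F_0\) and \(L_9\) are specified. The sketch below uses a hypothetical function name and converts the result from \({\mathrm {MeV}}^{-2}\) to \({\mathrm {fm}}^2\) with \(\hbar c=197.327\) MeV fm; the values chosen for \(F_0\) and \(L_9\) are placeholders for illustration and are not taken from the tables.

```python
import math

HBARC_MEV_FM = 197.327  # hbar * c in MeV fm, used to convert MeV^-2 to fm^2

def r2_vector_pion(m_pi, m_k, f0, l9, mu=770.0):
    """Pion charge radius squared of Eq. (73), returned in fm^2.

    m_pi, m_k, f0 and mu are in MeV; l9 is the renormalised LEC at scale mu.
    """
    logs = 3.0 + 2.0 * math.log(m_pi**2 / mu**2) + math.log(m_k**2 / mu**2)
    r2_inverse_mev_sq = -logs / (32.0 * math.pi**2 * f0**2) + 12.0 * l9 / f0**2
    return r2_inverse_mev_sq * HBARC_MEV_FM**2

# Placeholder inputs for illustration only.
M_PI, M_K, F0 = 135.0, 495.0, 80.0
for l9 in (0.0, 5.0e-3):
    print(f"L9 = {l9:.1e}: <r^2> = {r2_vector_pion(M_PI, M_K, F0, l9):.3f} fm^2")
```

The radius is linear in \(L_9\) with slope \(12/F_0^2\), so a measurement of the radius at known masses fixes \(L_9\) at this order once \(F_0\) is given.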

5.2.3 Partially quenched formulae

The term “partially quenched QCD” is used in two ways. For heavy quarks (\(c,b\) and sometimes \(s\)) it usually means that these flavours are included in the valence sector, but not in the functional determinant. For the light quarks (\(u,d\) and sometimes \(s\)) it means that they are present in both the valence and the sea sector of the theory, but with different masses (e.g. a series of valence quark masses is evaluated on an ensemble with a fixed sea quark mass).

The program of extending the standard (unitary) SU(3) theory to the (second version of) “partially quenched QCD” has been completed at the two-loop (NNLO) level for masses and decay constants [283]. These formulae tend to be complicated, with the consequence that a state-of-the-art analysis with \(O(2000)\) bootstrap samples on \(O(20)\) ensembles with \(O(5)\) masses each [and hence \(O(200'000)\) different fits] will require significant computational resources for the global fits. For an up-to-date summary of recent developments in Chiral Perturbation Theory relevant to lattice QCD we refer to [284].

The theoretical underpinning of how “partial quenching” is to be treated in the (properly extended) chiral framework is given in [285]. Specifically for partially quenched QCD with staggered quarks it is shown that a transfer matrix can be constructed which is not Hermitian but bounded, and can thus be used to construct correlation functions in the usual way.

5.2.4 Lattice determinations

To date, there are three comprehensive SU(3) papers with results based on lattice QCD with \(N_\mathrm{f}= 2 + 1\) dynamical flavours [15, 19, 79], and one more with results based on \(N_\mathrm{f}= 2 + 1 + 1\) dynamical flavours [156]. It is an open issue whether the data collected at \(m_{s} \simeq m_{s}^\mathrm{phys}\) allow for an unambiguous determination of SU(3) low-energy constants (cf. the discussion in [79]). To make definite statements one needs data at considerably smaller \(m_{s}\), and so far only MILC has some [15]. We are aware of a few papers with a result on one SU(3) low-energy constant each [78, 166, 253, 286] which we list for completeness. Some particulars of the computations are listed in Table 17.

Table 17 Lattice results for the low-energy constants \(F_0\), \(B_0\) and \(\Sigma _0\equiv F_0^2B_0\), which specify the effective SU(3) Lagrangian at leading order (MeV units). The ratios \(F/F_0\), \(B/B_0\), \(\Sigma /\Sigma _0\), which compare these with their SU(2) counterparts, indicate the strength of the Zweig-rule violations in these quantities (in the large-\(N_{c}\) limit, they tend to unity). Numbers in slanted fonts are calculated by us, from the information given in the quoted references

Results for the SU(3) low-energy constants of leading order are found in Table 17 and analogous results for some of the effective coupling constants that enter the chiral SU(3) Lagrangian at NLO are collected in Table 18. From PACS-CS [19] only those results are quoted which have been corrected for finite-size effects (misleadingly labelled “w/FSE” in their tables). For staggered data our colour-coding rule states that \(M_\pi \) is to be understood as \(M_\pi ^\mathrm{RMS}\). The rating of [15, 159] is based on the information regarding the RMS masses given in [37].

Table 18 Low-energy constants that enter the effective SU(3) Lagrangian at NLO (running scale \(\mu =770\,{\mathrm {MeV}}\)—the values in [15, 37, 56, 156, 159] are evolved accordingly). The MILC 10 entry for \(L_6\) is obtained from their results for \(2L_6 - L_4\) and \(L_4\) (and similarly for other entries in slanted fonts). The JLQCD 08A result [which is for \(\ell _5(770\,{\mathrm {MeV}})\) despite the paper saying \(L_{10}(770\,{\mathrm {MeV}})\)] has been converted to \(L_{10}\) with the standard one-loop formula, assuming that the difference between \(\bar{\ell }_5(m_{s} = m_{s}^\mathrm{phys})\) [needed in the formula] and \(\bar{\ell }_5(m_{s} = \infty )\) [computed by JLQCD] can be ignored

A graphical summary of the lattice results for the coupling constants \(L_4\), \(L_5\), \(L_6\) and \(L_8\), which determine the masses and the decay constants of the pions and kaons at NLO of the chiral SU(3) expansion, is displayed in Fig. 11, along with the two phenomenological determinations quoted in the above tables. The overall consistency seems fairly convincing. In spite of this apparent consistency, there is a point which needs to be clarified as soon as possible. Some collaborations (RBC/UKQCD and PACS-CS) find that they are having difficulties in fitting their partially quenched data to the respective formulae for pion masses above \(\simeq \)400 MeV. Evidently, this indicates that the data are stretching the regime of validity of these formulae. To date it is, however, not clear which subset of the data causes the troubles, whether it is the unitary part extending to too large values of the quark masses or whether it is due to \(m^\mathrm{val}/m^\mathrm{sea}\) differing too much from one. In fact, little is known, in the framework of partially quenched \(\chi \)PT, about the shape of the region of applicability in the \(m^\mathrm{val}\) versus \(m^\mathrm{sea}\) plane for fixed \(N_\mathrm{f}\). This point has also been emphasised in [244].

Fig. 11

Low-energy constants that enter the effective SU(3) Lagrangian at NLO. The grey bands and black dots labelled as “our estimate” coincide with the results of MILC 09A [37] for \(N_\mathrm{f}=2+1\) and HPQCD 13A [156] for \(N_\mathrm{f}=2+1+1\), respectively

To date only the computations MILC 09A [37] (as an obvious update of MILC 09) and HPQCD 13A [156] are free of red tags. Since they use different \(N_\mathrm{f}\) (in the former case \(N_\mathrm{f}=2+1\), in the latter case \(N_\mathrm{f}=2+1+1\)) we stay away from averaging them. Hence the situation remains unsatisfactory in the sense that for each \(N_\mathrm{f}\) only a single determination of high standing is available. Accordingly, we stay with the recommendation to use the results of MILC 09A [37] and HPQCD 13A [156] for \(N_\mathrm{f}=2+1\) and \(N_\mathrm{f}=2+1+1\), respectively, as given in Table 18. These numbers are shown as grey bands in Fig. 11.

In the large-\(N_{c}\) limit, the Zweig rule becomes exact, but the quarks have \(N_{c}=3\). The work done on the lattice is ideally suited to disprove or confirm the approximate validity of this rule for QCD. Two of the coupling constants entering the effective SU(3) Lagrangian at NLO disappear when \(N_{c}\) is sent to infinity: \(L_4\) and \(L_6\). The upper part of Table 18 and the left panels of Fig. 11 show that the lattice results for these are quite coherent. At the scale \(\mu =M_\rho \), \(L_4\) and \(L_6\) are consistent with zero, indicating that these constants do approximately obey the Zweig rule. As mentioned above, the ratios \(F/F_0\), \(B/B_0\) and \(\Sigma /\Sigma _0\) also test the validity of this rule. Their expansion in powers of \(m_{s}\) starts with unity and the contributions of first order in \(m_{s}\) are determined by the constants \(L_4\) and \(L_6\), but they also contain terms of higher order. Apart from measuring the Zweig-rule violations, an accurate determination of these ratios will thus also allow us to determine the range of \(m_{s}\) where the first few terms of the expansion represent an adequate approximation. Unfortunately, at present, the uncertainties in the lattice data on these ratios are too large to draw conclusions, both concerning the relative size of the subsequent terms in the chiral perturbation series and concerning the magnitude of the Zweig-rule violations. The data seem to confirm the paramagnetic inequalities [281], which require \(F/F_0>1\) and \(\Sigma /\Sigma _0>1\), and it appears that the ratio \(B/B_0\) is also larger than unity, but the numerical results need to be improved before further conclusions can be drawn.

In principle, the matching formulae in [56] can be used to calculate (see footnote 25) the SU(2) couplings \(\bar{\ell }_i\) from the SU(3) couplings \(L_j\). This procedure, however, yields less accurate results than a direct determination within SU(2), as it relies on the expansion in powers of \(m_{s}\), where the omitted higher-order contributions generate comparatively large uncertainties. We plead with every collaboration performing \(N_\mathrm{f}=2+1\) simulations to directly analyse their data in the SU(2) framework. In practice, lattice simulations are performed at values of \(m_{s}\) close to the physical value and the results are then corrected for the difference of \(m_{s}\) from its physical value. If simulations with more than one value of \(m_{s}\) have been performed, this can be done by interpolation. Alternatively one can use the technique of reweighting (for a review see e.g. [290]) to shift \(m_{s}\) to its physical value.

6 Kaon \(B\)-parameter \(B_K\)

6.1 Indirect CP-violation and \(\epsilon _{K}\)

The mixing of neutral pseudoscalar mesons plays an important role in the understanding of the physics of CP-violation. In this section we will only focus on \(K^0 - \bar{K}^0\) oscillations, which probe the physics of indirect CP-violation. We collect here the basic formulae; for extended reviews on the subject see, among others, Refs. [291–293]. Indirect CP-violation arises in \(K_L \rightarrow \pi \pi \) transitions through the decay of the \(\hbox {CP}=+1\) component of \(K_L\) into two pions (which are also in a \(\hbox {CP}=+1\) state). Its measure is defined as

$$\begin{aligned} \epsilon _{K} = \dfrac{\mathcal{A} [ K_L \rightarrow (\pi \pi )_{I=0}]}{\mathcal{A} [ K_S \rightarrow (\pi \pi )_{I=0}]}, \end{aligned}$$
(74)

with the final state having total isospin zero. The parameter \(\epsilon _{K}\) may also be expressed in terms of \(K^0 - \bar{K}^0\) oscillations. In particular, to lowest order in the electroweak theory, the contribution to these oscillations arises from so-called box diagrams, in which two \(W\)-bosons and two “up-type” quarks (i.e. up, charm, top) are exchanged between the constituent down and strange quarks of the \(K\)-mesons. The loop integration of the box diagrams can be performed exactly. In the limit of vanishing external momenta and external quark masses, the result can be identified with an effective four-fermion interaction, expressed in terms of the “effective Hamiltonian”

$$\begin{aligned} \mathcal{H}_\mathrm{eff}^{\Delta S = 2} = \frac{G_\mathrm{{F}}^2 M_{{W}}^2}{16\pi ^2} \mathcal{F}^0 Q^{\Delta S=2} + \hbox {h.c.} \end{aligned}$$
(75)

In this expression, \(G_\mathrm{{F}}\) is the Fermi coupling, \(M_{{W}}\) the \(W\)-boson mass, and

$$\begin{aligned} Q^{\Delta S=2}&= \left[ \bar{s}\gamma _\mu (1-\gamma _5)d\right] \left[ \bar{s}\gamma _\mu (1-\gamma _5)d\right] \nonumber \\&\equiv O_\mathrm{VV+AA}-O_\mathrm{VA+AV} \end{aligned}$$
(76)

is a dimension-six, four-fermion operator. The function \(\mathcal{F}^0\) is given by

$$\begin{aligned} \mathcal{F}^0 = \lambda _{c}^2 S_0(x_{c}) + \lambda _{t}^2 S_0(x_{t}) + 2 \lambda _{c} \lambda _{t} S_0(x_{c},x_{t}), \end{aligned}$$
(77)

where \(\lambda _{a} = V^*_{as} V_{ad}\), and \(a=c,t\) denotes a flavour index. The quantities \(S_0(x_{c}),\,S_0(x_{t})\) and \(S_0(x_{c},x_{t})\) with \(x_{c}=m_{c}^2/M_{{W}}^2\), \(x_{t}=m_{t}^2/M_{{W}}^2\) are the Inami–Lim functions [294], which express the basic electroweak loop contributions without QCD corrections. The contribution of the up quark, which is taken to be massless in this approach, has been taken into account by imposing the unitarity constraint \(\lambda _{u} + \lambda _{c} + \lambda _{t} = 0\). For future reference we note that the dominant contribution comes from the term \(\lambda _{t}^2 S_0(x_{t})\). This factor is proportional to \(|V_{cb}|^4\) if one enforces the unitarity of the CKM matrix. The dependence on a high power of \(V_{cb}\) is important from a phenomenological point of view, because it implies that uncertainties in \(V_{cb}\) are magnified when considering \(\epsilon _K\).
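As a rough numerical illustration of the two statements above, the following sketch evaluates the single-argument Inami–Lim function and the magnification of an uncertainty in \(V_{cb}\). The closed form of \(S_0(x)\) is the standard one from the literature and is not given in this section; the masses and the 2 % relative uncertainty are illustrative assumptions, not inputs recommended by this review.

```python
import math

def inami_lim_s0(x):
    """Single-argument Inami-Lim function S_0(x).

    Standard closed form quoted from the literature (an assumption here,
    since the section does not spell it out). Valid for x > 0, x != 1.
    """
    rational = (4 * x - 11 * x**2 + x**3) / (4 * (1 - x) ** 2)
    logarithmic = 3 * x**3 * math.log(x) / (2 * (1 - x) ** 3)
    return rational - logarithmic

# Illustrative mass inputs in GeV (assumptions, not values from this review).
M_W = 80.4
M_T = 163.0
M_C = 1.27

x_t = (M_T / M_W) ** 2
x_c = (M_C / M_W) ** 2

s0_t = inami_lim_s0(x_t)  # top contribution, of order a few
s0_c = inami_lim_s0(x_c)  # charm contribution, close to x_c itself

# The dominant term scales like |V_cb|^4, so a relative shift in |V_cb|
# is magnified roughly fourfold in that term.
rel_vcb = 0.02  # hypothetical 2 % relative uncertainty
rel_top_term = (1 + rel_vcb) ** 4 - 1

print(f"S0(x_t) = {s0_t:.3f}, S0(x_c)/x_c = {s0_c / x_c:.4f}")
print(f"relative shift of the top term = {rel_top_term:.3%}")
```

With these inputs \(S_0(x_t)\) exceeds \(S_0(x_c)\) by about four orders of magnitude, and the assumed 2 % shift in \(V_{cb}\) moves the top term by roughly 8 %.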

When strong interactions are included, \(\Delta {S}=2\) transitions can no longer be discussed at the quark level. Instead, the effective Hamiltonian must be considered between mesonic initial and final states. Since the strong coupling constant is large at typical hadronic scales, the resulting weak matrix element cannot be calculated in perturbation theory. The operator product expansion (OPE) does, however, factorise long- and short-distance effects. For energy scales below the charm threshold, the \(K^0-\bar{K}^0\) transition amplitude of the effective Hamiltonian can be expressed as

$$\begin{aligned}&{\langle }\bar{K}^0 \vert \mathcal{H}_\mathrm{eff}^{\Delta S = 2} \vert K^0{\rangle } = \frac{G_\mathrm{{F}}^2 M_{{W}}^2}{16 \pi ^2} \left[ \lambda _{c}^2 S_0(x_{c}) \eta _1 \, + \, \lambda _{t}^2 S_0(x_{t}) \eta _2\right. \nonumber \\&\qquad \left. + \, 2 \lambda _{c} \lambda _{t} S_0(x_{c},x_{t}) \eta _3\right] \nonumber \\&\qquad \times \left( \frac{\bar{g}(\mu )^2}{4\pi }\right) ^{-\gamma _0/(2\beta _0)} \exp \left\{ \int _0^{\bar{g}(\mu )} \, \hbox {d}g \, \left( \frac{\gamma (g)}{\beta (g)} + \frac{\gamma _0}{\beta _0g} \right) \right\} \nonumber \\&\qquad \times {\langle }\bar{K}^0 \vert Q^{\Delta S=2}_{R} (\mu ) \vert K^0 {\rangle } + \mathrm{h.c.}, \end{aligned}$$
(78)

where \(\bar{g}(\mu )\) and \(Q^{\Delta S=2}_{R}(\mu )\) are the renormalised gauge coupling and four-fermion operator in some renormalisation scheme. The factors \(\eta _1, \eta _2\) and \(\eta _3\) depend on the renormalised coupling \(\bar{g}\), evaluated at the various flavour thresholds \(m_{t}, m_{b}, m_{c}\) and \( M_{{W}}\), as required by the OPE and RG-running procedure that separates high- and low-energy contributions. Explicit expressions can be found in [292] and references therein, except that \(\eta _1\) and \(\eta _3\) have been recently calculated to NNLO in Refs. [295] and [296], respectively. We follow the same conventions for the RG-equations as in Ref. [292]. Thus the Callan–Symanzik function and the anomalous dimension \(\gamma (\bar{g})\) of \(Q^{\Delta S=2}\) are defined by

$$\begin{aligned} \dfrac{\hbox {d} \bar{g}}{\hbox {d} \ln \mu } = \beta (\bar{g}),\qquad \dfrac{\hbox {d} Q^{\Delta S=2}_{R}}{\hbox {d} \ln \mu } = -\gamma (\bar{g})\,Q^{\Delta S=2}_{R}, \end{aligned}$$
(79)

with perturbative expansions

$$\begin{aligned} \beta (g)&= -\beta _0 \dfrac{g^3}{(4\pi )^2} - \beta _1\dfrac{g^5}{(4\pi )^4} - \cdots \\ \gamma (g)&= \gamma _0 \dfrac{g^2}{(4\pi )^2} + \gamma _1\dfrac{g^4}{(4\pi )^4} + \cdots .\nonumber \end{aligned}$$
(80)

We stress that \(\beta _0, \beta _1\) and \(\gamma _0\) are universal, i.e. scheme-independent. \(K^0-\bar{K}^0\) mixing is usually considered in the naive dimensional regularisation (NDR) scheme of \({\overline{\mathrm{MS}}}\), and below we specify the perturbative coefficient \(\gamma _1\) in that scheme:

$$\begin{aligned}&\beta _0 = \left\{ \frac{11}{3}N_{c}-\frac{2}{3}N_\mathrm{f}\right\} , \nonumber \\&\beta _1 =\left\{ \frac{34}{3}N_{c}^2-N_\mathrm{f}\left( \frac{13}{3}N_{c}-\frac{1}{N_{c}}\right) \right\} , \nonumber \\&\gamma _0 = \frac{6(N_{c}-1)}{N_{c}}, \nonumber \\&\gamma _1 = \frac{N_{c}-1}{2N_{c}} \left\{ -21 + \frac{57}{N_{c}} -\frac{19}{3}N_{c} + \frac{4}{3}N_\mathrm{f}\right\} . \end{aligned}$$
(81)

Note that for QCD the above expressions must be evaluated for \(N_{c}=3\) colours, while \(N_\mathrm{f}\) denotes the number of active quark flavours. As already stated, Eq. (78) is valid at scales below the charm threshold, after all heavier flavours have been integrated out, i.e. \(N_\mathrm{f}= 3\).
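For concreteness, the coefficients of Eq. (81) can be evaluated exactly for \(N_{c}=3\) and \(N_\mathrm{f}=3\). The short sketch below uses rational arithmetic; the function name `rg_coefficients` is chosen here for illustration only.

```python
from fractions import Fraction

def rg_coefficients(n_c, n_f):
    """Return (beta0, beta1, gamma0, gamma1) of Eq. (81) as exact fractions.

    gamma1 is the NDR-MSbar value; the other three are scheme-independent.
    """
    n_c, n_f = Fraction(n_c), Fraction(n_f)
    beta0 = Fraction(11, 3) * n_c - Fraction(2, 3) * n_f
    beta1 = Fraction(34, 3) * n_c**2 - n_f * (Fraction(13, 3) * n_c - 1 / n_c)
    gamma0 = 6 * (n_c - 1) / n_c
    gamma1 = (n_c - 1) / (2 * n_c) * (
        -21 + 57 / n_c - Fraction(19, 3) * n_c + Fraction(4, 3) * n_f
    )
    return beta0, beta1, gamma0, gamma1

beta0, beta1, gamma0, gamma1 = rg_coefficients(3, 3)
print(beta0, beta1, gamma0, gamma1)  # 9 64 4 -17/3
```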

In Eq. (78), the terms proportional to \(\eta _1,\,\eta _2\) and \(\eta _3\), multiplied by the contributions containing \(\bar{g}(\mu )^2\), correspond to the Wilson coefficient of the OPE, computed in perturbation theory. Its dependence on the renormalisation scheme and scale \(\mu \) is cancelled by that of the weak matrix element \({\langle }\bar{K}^0 \vert Q^{\Delta S=2}_{R} (\mu ) \vert K^0 \rangle \). The latter corresponds to the long-distance effects of the effective Hamiltonian and must be computed non-perturbatively. For historical as well as technical reasons, it is convenient to express it in terms of the \(B\)-parameter \(B_{{K}}\), defined as

$$\begin{aligned} B_{{K}}(\mu )= \frac{{\left\langle \bar{K}^0\left| Q^{\Delta S=2}_{R}(\mu )\right| K^0\right\rangle } }{ {\frac{8}{3}f_{K}^2m_{K}^2}} \, \, . \end{aligned}$$
(82)

The four-quark operator \(Q^{\Delta S=2}(\mu )\) is renormalised at scale \(\mu \) in some regularisation scheme, for instance, NDR-\({\overline{\mathrm{MS}}}\). Assuming that \(B_{{K}}(\mu )\) and the anomalous dimension \(\gamma (g)\) are both known in that scheme, the renormalisation group invariant (RGI) \(B\)-parameter \(\hat{B}_{K}\) is related to \(B_{{K}}(\mu )\) by the exact formula

$$\begin{aligned} \hat{B}_{{K}}&= \left( \frac{\bar{g}(\mu )^2}{4\pi }\right) ^{-\gamma _0/(2\beta _0)}\nonumber \\&\times \exp \left\{ \int _0^{\bar{g}(\mu )} \, \hbox {d}g \left( \frac{\gamma (g)}{\beta (g)} + \frac{\gamma _0}{\beta _0g} \right) \right\} \, B_{{K}}(\mu ). \end{aligned}$$
(83)

At NLO in perturbation theory the above reduces to

$$\begin{aligned} \hat{B}_{{K}}&= \left( \frac{\bar{g}(\mu )^2}{4\pi }\right) ^{-\gamma _0/(2\beta _0)} \nonumber \\&\times \left\{ 1+\dfrac{\bar{g}(\mu )^2}{(4\pi )^2}\left[ \frac{\beta _1\gamma _0-\beta _0\gamma _1}{2\beta _0^2} \right] \right\} \, B_{{K}}(\mu ). \end{aligned}$$
(84)

To this order, this is the scale-independent product of all \(\mu \)-dependent quantities in Eq. (78).
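The size of the conversion factor in Eq. (84) can be illustrated numerically. The sketch below evaluates the NLO factor for \(N_{c}=N_\mathrm{f}=3\) and compares it with Eq. (83) evaluated with \(\beta \) and \(\gamma \) truncated to the two terms shown in Eq. (80), using a simple midpoint rule for the integral. The value \(\alpha _s = \bar{g}^2/(4\pi ) = 0.30\) is an illustrative assumption for a scale near 2 GeV, and the function names are hypothetical.

```python
import math

N_C, N_F = 3, 3
FOUR_PI = 4 * math.pi

# Coefficients of Eq. (81); GAMMA1 is the NDR-MSbar value.
BETA0 = 11 / 3 * N_C - 2 / 3 * N_F
BETA1 = 34 / 3 * N_C**2 - N_F * (13 / 3 * N_C - 1 / N_C)
GAMMA0 = 6 * (N_C - 1) / N_C
GAMMA1 = (N_C - 1) / (2 * N_C) * (-21 + 57 / N_C - 19 / 3 * N_C + 4 / 3 * N_F)

def nlo_rgi_factor(alpha_s):
    """Ratio hat{B}_K / B_K(mu) from Eq. (84), with alpha_s = gbar^2 / (4 pi)."""
    bracket = (BETA1 * GAMMA0 - BETA0 * GAMMA1) / (2 * BETA0**2)
    prefactor = alpha_s ** (-GAMMA0 / (2 * BETA0))
    return prefactor * (1 + alpha_s / FOUR_PI * bracket)

def truncated_rgi_factor(alpha_s, n_steps=2000):
    """Ratio from Eq. (83) with two-term beta and gamma, midpoint integration."""
    g_max = math.sqrt(FOUR_PI * alpha_s)
    step = g_max / n_steps
    integral = 0.0
    for i in range(n_steps):
        g = (i + 0.5) * step           # midpoints avoid the endpoint g = 0
        a = g * g / FOUR_PI**2         # a = g^2 / (4 pi)^2
        beta = -g * a * (BETA0 + BETA1 * a)
        gamma = a * (GAMMA0 + GAMMA1 * a)
        integral += (gamma / beta + GAMMA0 / (BETA0 * g)) * step
    return alpha_s ** (-GAMMA0 / (2 * BETA0)) * math.exp(integral)

ALPHA_S = 0.30  # illustrative assumption, not a value taken from this review
nlo = nlo_rgi_factor(ALPHA_S)
trunc = truncated_rgi_factor(ALPHA_S)
print(f"NLO factor = {nlo:.4f}, truncated Eq. (83) = {trunc:.4f}")
```

The two numbers differ only at the order neglected in Eq. (84), which gives a feeling for the size of the perturbative truncation effect at this value of the coupling.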

Lattice QCD calculations provide results for \(B_K(\mu )\). These results, however, are usually obtained in intermediate schemes other than the continuum \({\overline{\mathrm{MS}}}\) scheme used to calculate the Wilson coefficients appearing in Eq. (78). Examples of intermediate schemes are the RI/MOM scheme [297] (also dubbed the “Rome–Southampton method”) and the Schrödinger functional (SF) scheme [87], which both allow for a non-perturbative renormalisation of the four-fermion operator, using an auxiliary lattice simulation. In this way \(B_K(\mu )\) can be calculated with percent-level accuracy, as described below.

In order to make contact with phenomenology, however, and in particular to use the results presented above, one must convert from the intermediate scheme to the \({\overline{\mathrm{MS}}}\) scheme or to the RGI quantity \(\hat{B}_{K}\). This conversion relies on one- or two-loop perturbative matching calculations, whose truncation errors are, for many recent calculations, the dominant source of error in \(\hat{B}_{{K}}\) [25, 77, 298–300]. While this scheme-conversion error is not, strictly speaking, an error of the lattice calculation itself, it must be included in results for the quantities of phenomenological interest, namely \(B_K({\overline{\mathrm{MS}}},2\,\mathrm{GeV})\) and \(\hat{B}_{K}\). We note that this error can be minimised by matching between the intermediate scheme and \({\overline{\mathrm{MS}}}\) at as large a scale \(\mu \) as possible (so that the coupling constant which determines the rate of convergence is minimised). Recent calculations have pushed the matching scale \(\mu \) up to the range 3–3.5 GeV. This is possible because of the use of non-perturbative RG running determined on the lattice [25, 301]. The Schrödinger functional offers the possibility to run non-perturbatively to scales \(\mu \sim M_{{W}}\) where the truncation error can be safely neglected. However, so far this has been applied only for two flavours of Wilson quarks [302].
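To illustrate why a larger matching scale helps, the following sketch runs the coupling with the leading term of the \(\beta \)-function in Eq. (80), which gives \(\hbox {d}(1/\alpha _s)/\hbox {d}\ln \mu = \beta _0/(2\pi )\) for \(\alpha _s=\bar{g}^2/(4\pi )\). It is only a one-loop estimate with a fixed number of flavours and no threshold effects; the starting value \(\alpha _s(2\,\mathrm{GeV})=0.30\) and the function name are assumptions made for illustration.

```python
import math

def run_alpha_one_loop(alpha_ref, mu_ref, mu, n_f=3, n_c=3):
    """One-loop running of alpha_s from mu_ref to mu (same units for both).

    Uses only the beta_0 term of Eq. (80); flavour thresholds are ignored.
    """
    beta0 = 11 / 3 * n_c - 2 / 3 * n_f
    inverse = 1.0 / alpha_ref + beta0 / (2 * math.pi) * math.log(mu / mu_ref)
    return 1.0 / inverse

ALPHA_REF = 0.30  # illustrative assumption at mu_ref = 2 GeV
MU_REF = 2.0      # GeV

alpha_at = {mu: run_alpha_one_loop(ALPHA_REF, MU_REF, mu) for mu in (2.0, 3.0, 3.5)}
for mu, alpha in alpha_at.items():
    print(f"mu = {mu:.1f} GeV: alpha_s = {alpha:.3f}")
```

In this crude estimate the coupling drops by about a fifth between 2 GeV and 3.5 GeV, so each further order in the perturbative matching is correspondingly more suppressed.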

Perturbative truncation errors in Eq. (78) also affect the Wilson coefficients \(\eta _1\), \(\eta _2\) and \(\eta _3\). It turns out that the largest uncertainty comes from that in \(\eta _1\) [295]. Although it is now calculated at NNLO, the series shows poor convergence. The net effect is that the uncertainty in \(\eta _1\) is larger than that in present lattice calculations of \(B_K\).

The “master formula” for \(\epsilon _{K}\), which connects the experimentally observable quantity \(\epsilon _{K}\) to the matrix element of \(\mathcal{H}_\mathrm{eff}^{\Delta S = 2}\), is [293, 303–305]

$$\begin{aligned} \epsilon _{K}&= \exp (i \phi _\epsilon ) \,\, \sin (\phi _\epsilon ) \,\, \left[ \frac{\hbox {Im} [ {\langle }\bar{K}^0 \vert \mathcal{H}_\mathrm{eff}^{\Delta S = 2} \vert K^0 {\rangle }]}{\Delta m_K }\right. \nonumber \\&+\left. \rho \frac{\hbox {Im}(A_0)}{\hbox {Re}(A_0)}\right] , \end{aligned}$$
(85)

for \(\lambda _{u}\) real and positive; the phase of \(\epsilon _{K}\) is given by

$$\begin{aligned} \phi _\epsilon = \arctan \frac{\Delta m_{K}}{\Delta \Gamma _{K}/2}. \end{aligned}$$
(86)

The quantities \(\Delta m_K\equiv m_{K_L}-m_{K_S}\) and \(\Delta \Gamma _K\equiv \Gamma _{K_S}-\Gamma _{K_L}\) are the mass and decay-width differences between the long- and short-lived neutral kaons, while \(A_0\) is the amplitude of the kaon decay into a two-pion state with isospin zero. The experimentally measured values of the above quantities are [74]:

$$\begin{aligned} \vert \epsilon _{K} \vert&= 2.228(11) \times 10^{-3}, \nonumber \\ \phi _\epsilon&= 43.52(5)^\circ , \nonumber \\ \Delta m_{K}&= 3.4839(59) \times 10^{-12}\, \mathrm{MeV}, \nonumber \\ \Delta \Gamma _{K}&= 7.3382(33) \times 10^{-12} \,\mathrm{MeV}. \end{aligned}$$
(87)
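As a quick consistency check, the phase in Eq. (86) can be evaluated from the central values of \(\Delta m_{K}\) and \(\Delta \Gamma _{K}\) in Eq. (87). The sketch below uses central values only and ignores the experimental uncertainties.

```python
import math

# Central values from Eq. (87), both in MeV.
DELTA_M_K = 3.4839e-12
DELTA_GAMMA_K = 7.3382e-12

# Eq. (86): phi_epsilon = arctan( Delta m_K / (Delta Gamma_K / 2) ).
phi_rad = math.atan(DELTA_M_K / (DELTA_GAMMA_K / 2))
phi_deg = math.degrees(phi_rad)

# Modulus of the prefactor exp(i phi) sin(phi) in Eq. (85).
prefactor_modulus = math.sin(phi_rad)

print(f"phi_epsilon = {phi_deg:.2f} degrees")
print(f"sin(phi_epsilon) = {prefactor_modulus:.4f}")
```

The result agrees with the quoted \(\phi _\epsilon = 43.52(5)^\circ \) within its uncertainty.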

The second term in the square brackets of Eq. (85) has been discussed and estimated, e.g., in Refs. [305, 306]. It can best be thought of as \(\xi + (\rho -1)\xi \), with \(\xi =\mathrm{Im}(A_0)/\mathrm{Re}(A_0)\). The \(\xi \) term is the contribution of direct CP violation to \(\epsilon _K\). Using the estimate of \(\xi \) from Ref. [306] (obtained from the experimental value of \(\epsilon '/\epsilon \)) this gives a \(\sim -6.0(1.5)\,\%\) correction.

The \((\rho -1)\xi \) term arises from long-distance contributions to the imaginary part of \(K^0 -\bar{K}^0\) mixing [305] [contributions which are neglected in Eq. (78)]. Using the estimate \(\rho =0.6\pm 0.3\) [305], this gives a contribution of about +2 % with large errors. Overall these corrections combine to give a \((4 \pm 2)\,\%\) reduction in the prediction for \(\epsilon _K\). Although this is a small correction, we note that its contribution to the error of \(\epsilon _K\) is larger than that arising from the value of \(B_{K}\) reported below.
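The arithmetic behind these percentages can be sketched as follows. The inputs are the \(-6.0(1.5)\,\%\) correction from the \(\xi \) term and \(\rho =0.6\pm 0.3\) quoted above; treating the two uncertainties as uncorrelated and propagating them linearly is an assumption made here for illustration, not necessarily the procedure used in the cited references.

```python
import math

# Inputs quoted in the text (xi term given as a percentage shift of epsilon_K).
XI_CORR, XI_ERR = -6.0, 1.5  # percent
RHO, RHO_ERR = 0.6, 0.3

# Split rho * xi into the direct-CP piece and the long-distance piece.
direct_cp = XI_CORR                  # xi term
long_distance = (RHO - 1) * XI_CORR  # (rho - 1) * xi term
total = direct_cp + long_distance    # equals rho * xi

# Linear error propagation for the product rho * xi, assuming no correlation.
total_err = math.hypot(RHO * XI_ERR, RHO_ERR * XI_CORR)

print(f"long-distance piece = {long_distance:+.1f} %")
print(f"total correction    = {total:+.1f} +/- {total_err:.1f} %")
```

Under these assumptions the total is \(-3.6 \pm 2.0\,\%\), consistent with the \((4 \pm 2)\,\%\) reduction quoted above, with the uncertainty dominated by that of \(\rho \).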

6.2 Lattice computation of \(B_{{K}}\)

Lattice calculations of \(B_{{K}}\) are affected by the same systematic effects discussed in previous sections. However, the issue of renormalisation merits special attention. The reason is that the multiplicative renormalisability of the relevant operator \(Q^{\Delta S=2}\) is lost once the regularised QCD action ceases to be invariant under chiral transformations. For Wilson fermions, \(Q^{\Delta S=2}\) mixes with four additional dimension-six operators, which belong to different representations of the chiral group, with mixing coefficients that are finite functions of the gauge coupling. This complicated renormalisation pattern was identified as the main source of systematic error in earlier, mostly quenched calculations of \(B_{{K}}\) with Wilson quarks. It can be bypassed via the implementation of specifically designed methods, which are either based on Ward identities [309] or on a modification of the Wilson quark action, known as twisted-mass QCD [310, 311].

An advantage of staggered fermions is the presence of a remnant \(U(1)\) chiral symmetry. However, at non-vanishing lattice spacing, the symmetry among the extra unphysical degrees of freedom (tastes) is broken. As a result, mixing with other dimension-six operators cannot be avoided in the staggered formulation, which complicates the determination of the \(B\)-parameter. The effects of the broken taste symmetry are usually treated via an effective field theory, such as staggered Chiral Perturbation Theory (S\(\chi \)PT).

Fermionic lattice actions based on the Ginsparg–Wilson relation [